RECIPROCAL SUBTRACTION DIFFERENTIAL DISPLAY

This application is a continuation-in-part of U.S. Serial
No. 09/197,889, filed November 23, 1998, which is a
continuation-in-part of U.S. Serial Application No.
09/185,115, filed November 3, 1998, which is a
continuation-in-part of U.S. Serial Application No.
09/032,684, filed February 27, 1998. The contents of the
above-identified applications are hereby incorporated
into this application by reference.

Throughout this application, various references are 
referred to within parentheses. Disclosures of these 
publications in their entireties are hereby incorporated 
by reference into this application to more fully describe 
the state of the art to which this invention pertains. 



Background of the Invention

Changes in gene expression are important determinants of
normal cellular physiology, including cell cycle
regulation, differentiation and development, and they
directly contribute to abnormal cellular physiology,
including developmental anomalies, aberrant programs of
differentiation and cancer (1-4). In these contexts,
the identification, cloning and characterization of
differentially expressed genes will provide relevant and
important insights into the molecular determinants of
processes such as growth, development, aging,
differentiation and cancer. A number of procedures can
be used to identify and clone differentially expressed
genes. These include subtractive hybridization (5-10),
differential RNA display (DDRT-PCR) (3,4,11,12), RNA
fingerprinting by arbitrarily primed PCR (RAP-PCR)
(13,14), representational difference analysis (RDA) (15),
serial analysis of gene expression (SAGE) (16,17),
electronic subtraction (18,19) and combinatorial gene
matrix analyses (20).

Since first introduced by Liang and Pardee (11), DDRT-PCR
has gained wide popularity in analyzing and cloning
differentially expressed genes. In DDRT-PCR, total RNAs
or mRNAs from two or more cell types (or cells grown
under different conditions, cells representing different
stages of development, cells treated with agents
modifying cellular physiology, etc.) are
reverse-transcribed with two-base-pair anchored oligo dT
primers, which divide mRNA populations into 12 cDNA
subgroups. Then, each cDNA subgroup is amplified by PCR
with one of 20 arbitrary 10-mer 5' primers and a 3'
anchored primer, and the PCR-amplified cDNA fragments are
resolved in DNA sequencing gels. The combinations of
primers are designed not only to yield a detectable size
and number of bands, but also to display nearly the
complete repertoire of mRNA species.
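
By way of illustration only, the primer arithmetic described
above can be sketched as follows. The sketch assumes that the
two-base anchored primers have the form T12MN, with M being A,
C or G and N being A, C, G or T; this form, the T-run length of
12 and the function names are assumptions made for the example
and are not taken from this application.

```python
from itertools import product

# Assumption: anchored primers are written 5'->3' as T12MN, where the
# first anchor base M is not T and the second anchor base N is any base.
ANCHOR_M = "ACG"
ANCHOR_N = "ACGT"


def anchored_primers(t_length=12):
    """Return every two-base anchored oligo dT primer as a string."""
    return ["T" * t_length + m + n for m, n in product(ANCHOR_M, ANCHOR_N)]


def reaction_count(n_anchored, n_arbitrary):
    """Each anchored 3' primer is paired with each arbitrary 5' primer."""
    return n_anchored * n_arbitrary


primers = anchored_primers()
total_reactions = reaction_count(len(primers), 20)
print(len(primers), total_reactions)
```

Under these assumptions the 12 anchored primers paired with 20
arbitrary primers give 240 primer combinations per RNA sample,
which indicates why reducing the number of primer sets is
attractive.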

DDRT-PCR is a powerful methodology in which a vast number
of mRNA species (>20,000, if no redundancy occurs) can be
analyzed with only a small quantity of RNA (about 5
µg) (11). DDRT-PCR is often the method of choice when the
RNA source is limiting, such as tissue biopsies. A
direct advantage of DDRT-PCR is the ability to identify
and isolate both up- and down-regulated differentially
expressed genes in the same reaction. Furthermore, the
DDRT-PCR technique permits the display of multiple
samples in the same gel, which is useful in defining
specific diagnostic alterations in RNA species and for
temporally analyzing gene expression changes. However,
the DDRT-PCR technique is not problem free. Difficulties
encountered when using standard DDRT-PCR include a high
incidence of false positives and redundant gene
identification, poor reproducibility, biased gene display
and lack of functional information about the cloned cDNA.
Furthermore, poor separation can mask differentially
expressed genes of low abundance under the intense
signals generated by highly expressed genes. The
generation of false positives and redundancy can be
highly problematic, resulting in an inordinate
expenditure of resources to confirm appropriate
differential expression and uniqueness of the isolated
cDNAs. The cDNAs must be isolated from the gels in pure
form (contamination of bands with multiple sequences
complicates clone identification), reamplified, placed in
an appropriate cloning vector, analyzed for authentic
differential expression and finally sequenced. These
limitations of the standard DDRT-PCR approaches emphasize
the need for improvements in this procedure to more
efficiently and selectively identify differentially
expressed genes.

A number of modifications and improvements of the
DDRT-PCR approach have been described (21-23). Single
anchor or degenerate two-base anchor oligo dT primers can
be used to streamline the massive numbers of reverse
transcription and PCR reactions required for validation
of cDNAs as well as to reduce false positives (24,25).
Reproducibility can be improved by lengthening the
arbitrary 5' primers to accommodate a convenient
restriction site, followed by two cycles of PCR with
successive low- and high-stringency annealing
temperatures (25,26). DDRT-PCR with inosine-containing
5' arbitrary primers can also increase reproducibility of
this approach (27). However, since these modifications
have only been analyzed using a subset of primers,
further studies are necessary to validate these
modifications of DDRT-PCR with additional primers and in
several model systems.

In addition to genomic DNA contamination, mispriming and
PCR artifacts, the high incidence of false positives and
redundancy is also ascribed to poor separation between
bands and the complexity of the templates amplified (28).
Furthermore, poor separation can mask differentially
expressed genes of low abundance under the intense
signals generated by highly expressed genes. By
enriching for unique cDNAs and removing common ones, it
should in principle be possible to enrich for
low-abundance gene products and significantly decrease the
complexity of amplified sequences. In addition, the
sequence bias of DDRT-PCR should also be reduced by
decreasing template complexity. These assumptions serve
as the basis for the development of reciprocal
subtraction differential RNA display (RSDD).



Subtractive hybridization, in which hybridization between
tester and driver is followed by selective removal of
common gene products, enriches for unique gene products
in the tester cDNA population and reduces the abundance
of common cDNAs (9). A subtracted cDNA library can be
analyzed to identify and clone differentially expressed
genes by randomly picking colonies or by differential
screening (29-31). Although subtractive hybridization
has been successfully used to clone a number of
differentially expressed genes (5-7,10), this approach is
labor-intensive and does not result in isolation of
the full spectrum of genes displaying altered expression
(9,18).

In principle, DDRT-PCR performed with subtracted RNA or
cDNA samples represents a powerful strategy to clone up-
and down-regulated gene products. This approach should
result in the enrichment of unique sequences and a
reduction or elimination of common sequences. This
scheme should also result in a consistent reduction in
band complexity on a display gel, thereby permitting a
clearer separation of cDNAs resulting in fewer false
positive reactions. Additionally, it should be possible
to use fewer primer sets for reverse transcription and
PCR reactions to analyze the complete spectrum of
differentially expressed genes. Of particular importance
for gene identification and isolation, rare gene products
that are masked by strong common gene products should be
displayed by using subtraction hybridization in
combination with DDRT-PCR. In addition, the DDRT-PCR
approach with subtractive libraries could also prove
valuable for efficiently screening subtracted cDNA
libraries for differentially expressed genes. However,
even though subtraction hybridization plus DDRT-PCR
appears attractive for the reasons indicated above, a
previous attempt to use this approach has proven of only
marginal success in consistently reducing the complexity
of the signals generated, compared with the standard
DDRT-PCR scheme (32).

We presently describe a reciprocal subtraction
differential RNA display (RSDD) approach that efficiently
and consistently reduces the complexity of DDRT-PCR and
results in the identification and cloning of genes
displaying anticipated differential expression.

Summary of the Invention

This invention provides a method for identifying
differentially expressed nucleic acids between two
samples, comprising: (a) selecting a first and second
nucleic acid sample, wherein the nucleic acid samples
contain a repertoire of nucleic acids; (b) performing
reciprocal subtraction between the nucleic acid samples
to produce two subtracted nucleic acid samples; (c)
amplifying the two subtracted nucleic acid samples; and
(d) comparing the two subtracted nucleic acid samples to
identify differentially expressed nucleic acids.

This invention also provides a method for identifying
differentially expressed nucleic acids between two
samples, comprising: (a) selecting a first and second
nucleic acid sample, wherein the nucleic acid samples
contain a repertoire of nucleic acids; (b) amplifying the
two nucleic acid samples; (c) performing reciprocal
subtraction between the amplified nucleic acid samples to
produce two subtracted nucleic acid samples; and (d)
comparing the two subtracted nucleic acid samples to
identify differentially expressed nucleic acids.

This invention further provides the above-described
methods, wherein the first and second nucleic acid
samples are obtained from cells in different
developmental stages.

This invention further provides the above-described
methods, wherein the first and second nucleic acid
samples are obtained from cells from different tissue
types.

Also, this invention provides the above-described
methods, wherein the 3' primer used in the PCR
amplification is an oligo dT 3' primer.



WO 99/43844 PCT/US99/04323 

In addition, this invention provides the above-described
methods, wherein the 3' primer used in the PCR
amplification is a single anchor oligo dT 3' primer.

This invention also provides the above-described methods,
wherein the comparing of step (e) comprises using a gel
to separate the nucleic acids from both of the libraries.

This invention provides the isolated nucleic acid
identified by the above-described methods, wherein
the nucleic acid was not previously known to be
differentially expressed between the two samples.

Brief Description of the Figures

Figure 1 

Identification of differentially expressed sequence tags
using reciprocal subtraction differential RNA display
(RSDD). Left panel: differential RNA display pattern of
conventional DDRT-PCR with RNA from E11 (C) and E11-NMT
(T) cells and an RSDD analysis of reciprocally subtracted
E11 minus E11-NMT (C/T) and E11-NMT minus E11 (T/C) cDNA
libraries. Right panel: representative RSDD patterns
using different sets of primers.

Figure 2 

Reverse Northern analysis of differentially expressed
sequence tags identified by reciprocal subtraction
differential RNA display (RSDD). Differentially expressed
sequence tags obtained from RSDD were dot-blotted onto
nylon membranes and probed with 32P-cDNA reverse
transcribed from RNA samples of E11 and E11-NMT cells.

Figure 3A 

Differential expression of representative progression
elevated genes (PEGen) and progression suppressed genes
(PSGen) identified by reciprocal subtraction differential
RNA display (RSDD) and reverse Northern blotting.
Northern blots of E11 and E11-NMT RNA samples were probed
with radiolabeled (32P) expressed sequence tags identified
by RSDD and reverse Northern blotting.

Figure 3B

Differential expression of representative progression
elevated genes (PEGen) and progression suppressed genes
(PSGen) identified by reciprocal subtraction differential
RNA display (RSDD) and reverse Northern blotting.

Figure 4 

Differential expression of representative progression
elevated genes (PEGen) and progression suppressed genes
(PSGen) identified by reciprocal subtraction differential
RNA display (RSDD) and reverse Northern blotting.
Northern blots of cells displaying various stages of
transformation progression were probed with radiolabeled
(32P) expressed sequence tags identified by RSDD and
reverse Northern blotting. The cell types used include:
Unprogressed E11 (-), CREFxE11-NMT F1 (-) and
CREFxE11-NMT F2 (-) somatic cell hybrids, E11xE11-NMT A6
(-) somatic cell hybrid, E11xE11-NMT 3b (-) somatic cell
hybrid, and E11-NMT Aza B1 (-) and E11-NMT Aza C1 (-)
5-azacytidine treated E11-NMT clones; and Progressed
E11-NMT (+), CREFxE11-NMT R1 (+) and CREFxE11-NMT R2 (+)
somatic cell hybrids, E11xE11-NMT A6TD (+) nude mouse
tumor derived somatic cell hybrid, E11xE11-NMT Ha (+),
E11-Ras R12 (+) a Ha-ras transformed E11 clone and
E11-HPV E6/E7 (+) an E11 clone transformed by the E6 and
E7 region of HPV-18.

Figure 5

cDNA fragment of PEGen 7 - 90% Homology to Human HPV16
E1BP. (Sequence ID No. 1)

Figure 6

cDNA fragment of PEGen 8 - Rat phosphofructose kinase C.
(Sequence ID No. 2)

Figure 7

First (Sequence ID No. 3) and second (Sequence ID No. 4)
cDNA fragments of PEGen 13.

Figure 8

cDNA fragment of PEGen 14. (Sequence ID No. 5)

Figure 9

cDNA fragment of PEGen 15. (Sequence ID No. 6)

Figure 10

cDNA fragment of PEGen 21 which has 94% homology to mouse
FIN 14. (Sequence ID No. 7)

Figure 11

cDNA fragment of PEGen 24. (Sequence ID No. 8)

Figure 12

cDNA fragment of PEGen 26 - Rat poly ADP-ribose
polymerase. (Sequence ID No. 9)

Figure 13

cDNA fragment of PEGen 28. (Sequence ID No. 10)

Figure 14

cDNA fragment of PEGen 42. (Sequence ID No. 11)

Figure 15

cDNA fragment of PEGen 43. (Sequence ID No. 12)

Figure 16

cDNA fragment of PEGen 44. (Sequence ID No. 13)

Figure 17

cDNA fragment of PEGen 48. (Sequence ID No. 14)

Figure 18

cDNA fragment of PSGen 1 which has 80% homology to B.
taurus supervillin. (Sequence ID No. 15)

Figure 19

cDNA fragment of PSGen 2 which has 91% homology to human
HTLV-1 Tax interacting protein. (Sequence ID No. 16)

Figure 20

cDNA fragment of PSGen 4 - Rat proteasome activator.
(Sequence ID No. 17)



Figure 21

cDNA fragment of PSGen 10 - Rat Ferritin Heavy Chain.
(Sequence ID No. 18)

Figure 22

cDNA fragment of PSGen 12. (Sequence ID No. 19)

Figure 23

cDNA fragment of PSGen 13. (Sequence ID No. 20)

Figure 24

cDNA fragment of PSGen 23. (Sequence ID No. 21)

Figure 25

cDNA fragment of PSGen 24. (Sequence ID No. 22)

Figure 26

cDNA fragment of PSGen 25. (Sequence ID No. 23)

Figure 27

cDNA fragment of PSGen 26.

Figure 28

cDNA fragment of PSGen 27.

Figure 29

cDNA fragment of PSGen 28.

Figure 30

cDNA fragment of PSGen 29.

Figure 31

cDNA fragment of PEGen 32.



Figure 32

Schematic outline of the reciprocal subtraction
differential RNA display (RSDD) protocol. This scheme
incorporates three steps: reciprocal subtraction of cDNA
libraries, differential display of in vivo excised cDNAs and
expression analysis by reverse Northern and standard
Northern blotting. For the present application of RSDD,
reciprocal subtraction hybridization was performed using
libraries constructed from E11 and E11-NMT cells, i.e.,
E11 minus E11-NMT and E11-NMT minus E11. Differentially
expressed cDNAs identified on gels using differential RNA
display were isolated, reamplified and analyzed for
expression by reverse Northern blotting. To confirm
differential expression, cDNAs were analyzed using
Northern blotting.

Figure 33

Differential expression of representative progression
elevated (PEGen) and progression suppressed genes (PSGen)
identified by RSDD and reverse Northern blotting.
Northern blots of E11 and E11-NMT RNA samples were probed
with radiolabeled (32P) expressed sequence tags identified
by RSDD and reverse Northern blotting. Equal loading of
E11 and E11-NMT RNA is demonstrated by ethidium bromide
(EtBr) staining.

Figure 34

Differential expression of representative PEGen and PSGen
genes identified by RSDD and reverse Northern blotting in
a large panel of rodent cells displaying differences in
transformation progression. Northern blots of cells
displaying various stages of transformation progression
were probed with radiolabeled (32P) expressed sequence
tags identified by RSDD and reverse Northern blotting.
The cell types used include: Unprogressed E11 (-), CREF
X E11-NMT F1 (-) and CREF X E11-NMT F2 (-) somatic cell
hybrids, E11 X E11-NMT A6 (-) somatic cell hybrid, E11 X
E11-NMT 3b (-) somatic cell hybrid, and E11-NMT AZA B1
(-) and E11-NMT AZA C1 (-) 5-azacytidine-treated E11-NMT
clones; and Progressed E11-NMT (+), CREF X E11-NMT R1 (+)
and CREF X E11-NMT R2 (+) somatic cell hybrids, E11 X
E11-NMT A6TD (+) nude mouse tumor derived somatic cell
hybrid, E11 X E11-NMT Ha (+), E11-Ras R12 (+) and
E11-HPV E6/E7 (+) an E11 clone transformed by the E6 and
E7 region of HPV-18. Equal loading of RNAs is
demonstrated by ethidium bromide (EtBr) staining.

Figure 35A

PSGen 12 cDNA Sequence and PSGen 12 Protein Sequence

Figure 35B

PSGen 13 cDNA Sequence and PSGen 13 Protein Sequence

Figure 35C

PEGen 28 cDNA Sequence and PEGen 28 Protein Sequence

Figure 35D

PEGen 32 cDNA Sequence and PEGen 32 Protein Sequence

Figure 35E

PEGen 42 cDNA Sequence and PEGen 42 Protein Sequence

Figure 35F

PEGen 45 cDNA Sequence

Figure 35G-1 and Figure 35G-2

PEGen 50 cDNA Sequences, which are different parts of the
gene.

Figure 36

PSGen 27 - Novel

Detailed Description of the Invention

This invention provides a method for identifying
differentially expressed nucleic acids between two
samples, comprising: (a) selecting a first and second
nucleic acid sample, wherein the nucleic acid samples
contain a repertoire of nucleic acids; (b) performing
reciprocal subtraction between the nucleic acid samples
to produce two subtracted nucleic acid samples; (c)
amplifying the two subtracted nucleic acid samples; and
(d) comparing the two subtracted nucleic acid samples to
identify differentially expressed nucleic acids.



In an embodiment, the nucleic acid samples are mRNA or 
derived from mRNA. In another embodiment, the nucleic 
acid samples are total RNA. In another embodiment, the 
nucleic acid samples are cDNA. In another embodiment, 
the nucleic acid samples are a nucleic acid library. 

In an embodiment, differentially expressed nucleic acids
are expressed at different levels. In a further
embodiment, one of the nucleic acids is not expressed.
In a different embodiment, one of the nucleic acids is
expressed in truncated form.

As used herein, reciprocal subtraction includes using
nucleic acid sample A to subtract common nucleic acids
from nucleic acid sample B (based on hybridization) and
also using nucleic acid sample B to subtract common
nucleic acids from nucleic acid sample A. In an embodiment,
the complement of nucleic acid sample A is used to
subtract nucleic acids from nucleic acid sample B and the
complement of nucleic acid sample B is used to subtract
nucleic acids from nucleic acid sample A. In a further
embodiment, the RNA of nucleic acid sample A is used to
subtract nucleic acids from nucleic acid sample B and the
RNA of nucleic acid sample B is used to subtract nucleic
acids from nucleic acid sample A. In yet another
embodiment, the cDNA of nucleic acid sample A is used to
subtract nucleic acids from nucleic acid sample B and the
cDNA of nucleic acid sample B is used to subtract nucleic
acids from nucleic acid sample A.
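
For illustration only, the logic of reciprocal subtraction as
defined above can be sketched as a pair of set differences.
This is an idealized model and not a description of the
laboratory procedure: hybridization-based subtraction enriches
for unique sequences rather than removing every common one,
and the transcript labels and the function name below are
hypothetical.

```python
def reciprocal_subtraction(sample_a, sample_b):
    """Return (A minus B, B minus A) for two collections of sequence labels.

    Idealized assumption: any label present in both samples counts as
    "common" and is removed from each sample.
    """
    a, b = set(sample_a), set(sample_b)
    common = a & b
    return a - common, b - common


# Hypothetical transcript labels, used only for illustration.
sample_a = ["actin", "gapdh", "gene_x"]
sample_b = ["actin", "gapdh", "gene_y", "gene_z"]

a_minus_b, b_minus_a = reciprocal_subtraction(sample_a, sample_b)
print(sorted(a_minus_b))  # labels remaining in sample A after subtraction
print(sorted(b_minus_a))  # labels remaining in sample B after subtraction
```

Running the subtraction in both directions is what allows
sequences elevated in either sample to be recovered from the
same pair of samples.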

As used herein, methods of amplification include PCR and
rolling circle replication.




A basic description of nucleic acid amplification is
provided in Mullis, U.S. Patent No. 4,683,202, which is
incorporated herein by reference. The amplification
reaction uses a template nucleic acid contained in a
sample, two primer sequences and inducing agents. The
extension product of one primer when hybridized to the
second primer becomes a template for the production of a
complementary extension product and vice versa, and the
process is repeated as often as is necessary to produce
a detectable amount of the sequence.
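
As a rough illustration of why repeating the process yields a
detectable amount of sequence, the commonly used idealized
model of amplification treats each cycle as multiplying the
copy number by (1 + efficiency). The efficiency parameter, the
function names and the example numbers below are assumptions
made for this sketch; they are not values given in this
application.

```python
import math


def copies_after(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a number of cycles.

    efficiency=1.0 means perfect doubling in every cycle (an assumption).
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_to_reach(initial_copies, target_copies, efficiency=1.0):
    """Smallest whole number of cycles needed to reach the target copy number."""
    if target_copies <= initial_copies:
        return 0
    ratio = target_copies / initial_copies
    return math.ceil(math.log(ratio) / math.log(1.0 + efficiency))


print(copies_after(1, 10))       # perfect doubling for 10 cycles
print(cycles_to_reach(1, 1000))  # cycles needed for a 1000-fold increase
```

In practice the per-cycle efficiency is below 1 and falls in
later cycles, so the model gives only an upper bound on yield.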

The inducing agent may be any compound or system which
will function to accomplish the synthesis of primer
extension products, including enzymes. Suitable enzymes
for this purpose include, for example, E. coli DNA
polymerase I, thermostable Taq DNA polymerase, Klenow
fragment of E. coli DNA polymerase I, T4 DNA polymerase,
other available DNA polymerases, reverse transcriptase
and other enzymes which will facilitate combination of
the nucleotides in the proper manner to form
amplification products. The oligonucleotide primers can
be synthesized by automated instruments sold by a variety
of manufacturers or can be commercially prepared based
upon the nucleic acid sequence of this invention.

This invention also provides a method for identifying
differentially expressed nucleic acids between two
samples, comprising: a) selecting a first and second
nucleic acid sample; b) producing libraries for the first
and second nucleic acid sample; c) amplifying the two
libraries; d) performing reciprocal subtraction between
the amplified libraries to produce two subtracted
libraries; and e) comparing the two subtracted libraries
to identify differentially expressed nucleic acids.



This invention also provides a method for identifying
differentially expressed nucleic acids between two
samples, comprising: (a) selecting a first and second
nucleic acid sample, wherein the nucleic acid samples
contain a repertoire of nucleic acids; (b) amplifying the
two nucleic acid samples; (c) performing reciprocal
subtraction between the amplified nucleic acid samples to
produce two subtracted nucleic acid samples; and (d)
comparing the two subtracted nucleic acid samples to
identify differentially expressed nucleic acids.

This invention also provides the above-described methods, 
wherein the two subtracted nucleic acid samples from step 
c are amplified prior to the comparing of step d. 



This invention also provides the above-described methods,
wherein each of the nucleic acid samples comprises a
library of nucleic acids.

This invention also provides the above-described methods,
wherein the nucleic acid samples are obtained from total
cellular RNA purified by hybridization with oligo(dT).

This invention also provides the above-described methods,
wherein the nucleic acid samples are obtained from total
RNA from E11 and E11-NMT cells.

E11 is an adenovirus-transformed rat embryo cell line
that acquires an aggressive oncogenic progression
phenotype when injected into athymic nude mice and
reisolated in cell culture (E11-NMT).

This invention further provides the above-described
methods, wherein the first and second nucleic acid
samples are obtained from cells in different
developmental stages.

This invention further provides the above-described
methods, wherein the first and second nucleic acid
samples are obtained from cells from different tissue
types.



This invention further provides the above-described
methods, wherein the first and second nucleic acid
samples are obtained from cells that differ in their
exposure to external factors or in their gene expression.

In an embodiment, cells that differ in their exposure to
external factors or in their gene expression include any
cells that may have different levels of gene expression,
wherein some genes may not be expressed at all. In
another embodiment, cells that differ in their exposure
to external factors or in their gene expression include
any cells that are likely to have different levels of
gene expression, wherein some genes may not be expressed
at all. In still another embodiment, cells that differ
in their exposure to external factors or in their gene
expression include any cell that has a phenotypically
recognizable difference.

A short list of examples of cells that differ in their
exposure to external factors or in their gene expression
includes: cancerous versus normal cells, advanced cancer
progression cells versus earlier cancer stage cells,
diseased cells versus nondiseased cells, infected cells
versus noninfected cells, later developmental stage cells
versus earlier developmental stage cells, cells after DNA
damage versus cells before DNA damage, senescent cells
versus younger cells, cells induced by growth factors
versus cells not induced by growth factors, cells in the
process of neurodegeneration versus normal cells, and
cells exposed to a chemotherapeutic agent versus normal
cells.

As used herein, different tissue types include but are
not limited to tissues containing: cells grown under or
exposed to different conditions, cells in different
stages of development, cells treated with agents
modifying cellular physiology, and cells having different
functions.



In an embodiment, cells at different stages of
development are cells taken or analyzed at times
differing by one or more hours in the development of the
cell or organism.

Further, this invention provides the above-described
methods, wherein the amplifying of step (d) comprises PCR
amplification.

Also, this invention provides the above-described
methods, wherein the 3' primer used in the PCR
amplification is an oligo dT 3' primer. A few examples
of oligo dT primers are T13, T13A, and T13GA.

In addition, this invention provides the above-described
methods, wherein the 3' primer used in the PCR
amplification is a single anchor oligo dT 3' primer.
Oligo dT 3' primers include T13A, T13C, and T13G.
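
For readers unfamiliar with the primer shorthand, the
following sketch expands a name such as T13A into the
corresponding 5' to 3' sequence. It assumes that the shorthand
denotes a run of thymidines of the stated length followed by
the listed anchor bases; the function name is hypothetical.

```python
import re


def expand_oligo_dt(shorthand):
    """Expand e.g. "T13A" into thirteen T's followed by the anchor "A"."""
    match = re.fullmatch(r"T(\d+)([ACGT]*)", shorthand)
    if match is None:
        raise ValueError(f"unrecognized oligo dT shorthand: {shorthand!r}")
    t_count = int(match.group(1))
    anchor = match.group(2)
    return "T" * t_count + anchor


for name in ["T13", "T13A", "T13C", "T13G", "T13GA"]:
    print(name, expand_oligo_dt(name))
```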

This invention provides the above-described methods,
wherein the PCR amplification uses a set of random
primers.



This invention provides the above-described methods,
wherein the 5' primer is an arbitrary primer.

This invention also provides the above-described methods,
wherein the comparing of step (e) comprises using a gel
to separate the nucleic acids from both of the
subtracted libraries.

In an embodiment, the gel is a polyacrylamide gel. In
another embodiment, the gel is an agarose gel.

This invention further provides the above-described
methods, further comprising PCR amplifying the first and
second nucleic acid samples.



This invention also provides the above-described methods,
further comprising reamplifying differentially expressed
bands.

This invention also provides the above-described methods,
further comprising reamplifying differentially expressed
nucleic acid.

In one method of reamplifying differentially expressed
bands, differentially amplified bands from plasmids of
each subtracted library were marked with an 18G needle
through the film and cut out with a razor. The cut-out
differentially expressed bands can be reamplified (i.e.,
by PCR) and examined by reverse Northern and Northern
blot analyses.

In addition, this invention provides the above-described
methods, wherein the comparing of step (e) comprises
comparing the band intensities of the two amplified
differentially expressed nucleic acids.

In addition, this invention provides the above-described
methods, wherein the nucleic acid samples are mRNA or
cDNA derived from mRNA.

In addition, this invention provides the above-described
methods, wherein the comparing of step (e) comprises
comparing the quantities of the two amplified
differentially expressed nucleic acids.

This invention further provides the above-described
methods, wherein the differences in band intensity
between the two subtracted libraries are electronically
quantified.

This invention further provides the above-described
methods, wherein the differences in the quantities of
nucleic acid between the two subtracted libraries are
electronically quantified.

In one embodiment, electronic quantification involves
using a scanner to detect the bands. In a further
embodiment, computer software, such as Corel Draw, can be
used to determine the pixel intensity of the scanned
image, thereby quantifying the band intensity.
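
A minimal sketch of pixel-based band quantification is given
below for illustration. It assumes that the scanned image is
available as a grid of grayscale values in which larger values
mean a stronger signal, and that the band occupies a known
rectangular region; the pixel values, region coordinates and
function name are hypothetical and do not come from this
application.

```python
def band_intensity(image, top, bottom, left, right, background=0):
    """Sum background-corrected pixel values inside a rectangular band region.

    Rows top..bottom-1 and columns left..right-1 are included; values
    below the background level contribute zero.
    """
    total = 0
    for row in image[top:bottom]:
        for pixel in row[left:right]:
            total += max(pixel - background, 0)
    return total


# Hypothetical 4 x 4 scans of the same band position in two lanes.
lane_a = [[0, 0, 0, 0], [0, 200, 210, 0], [0, 190, 205, 0], [0, 0, 0, 0]]
lane_b = [[0, 0, 0, 0], [0, 50, 60, 0], [0, 55, 45, 0], [0, 0, 0, 0]]

intensity_a = band_intensity(lane_a, 1, 3, 1, 3)
intensity_b = band_intensity(lane_b, 1, 3, 1, 3)
print(intensity_a, intensity_b, intensity_a / intensity_b)
```

Comparing the two summed intensities gives a simple numerical
measure of the difference in band intensity between lanes.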

Also, this invention provides the above-described
methods, wherein the libraries of step (b) are
constructed with λ-ZAP cDNA library kits. One skilled in
the art would recognize that any cDNA library would be
suitable.

This invention provides the isolated nucleic acid
identified by the above-described methods, wherein
the nucleic acid was not previously known.

This invention also provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PSGen 12 (AI 144569).

In addition, this invention provides the above-described
isolated nucleic acid, wherein the isolated nucleic acid
is the nucleic acid designated PSGen 13 (Accession No. AI
144570).

This invention provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PSGen 23.

This invention provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PSGen 24.

This invention provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PSGen 25.

This invention provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PSGen 26 (Accession No. AI
144571).

This invention also provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PSGen 27 (Accession No. AI
144572).

This invention provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PSGen 28 (AI 144573).

This invention provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PSGen 29 (AI 144574).

This invention provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PEGen 13 (AI 144564).

This invention provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PEGen 14 (AI 144565).

5 This invention provides the above -described isolated 
nucleic acid, wherein the isolated nucleic acid is the 
nucleic acid designated PEGen 15. 

This invention provides the above -described isolated 
10 nucleic acid, wherein the isolated nucleic acid is the 
nucleic acid designated PEGen 24 (Accession No. AI 
144566) . 

This invention provides the above -described isolated 
15 nucleic acid, wherein the isolated nucleic acid is the 
nucleic acid designated PEGen 28 (AI 144567) . 

This invention provides the above-described isolated 
nucleic acid, wherein the isolated nucleic acid is the 
20 nucleic acid designated PEGen 32 (AI 144568) . 

This invention provides the above -described isolated 
nucleic acid, wherein the isolated nucleic acid is the 
nucleic acid designated PEGen 42. 

25 

This invention provides the above -described isolated 
nucleic acid, wherein the isolated nucleic acid is the 
nucleic acid designated PEGen 43. 

3 0 This invention provides the above-described isolated 
nucleic acid, wherein the isolated nucleic acid is the 
nucleic acid designated PEGen 44. 

This invention provides the above-described isolated
nucleic acid, wherein the isolated nucleic acid is the
nucleic acid designated PEGen 48.

This invention further provides a previously unknown
isolated nucleic acid molecule identified by the
above-described methods which comprises (a) one of the
nucleic acid sequences as set forth in Figure 35; (b) a
sequence degenerate to a sequence of (a) as a result of
the genetic code; (c) a sequence encoding one of the amino
acid sequences as set forth in Figure 35; or (d) a sequence
of at least 12 nucleotides capable of specifically
hybridizing to the sequence of (a), (b) or (c).

Finally, this invention provides a purified polypeptide
comprising one of the amino acid sequences as set forth in
Figure 35.

The sequences of the cDNAs of PSGen 12, PSGen 13, PSGen
26, PSGen 27, PSGen 28, PSGen 29, PEGen 13, PEGen 14,
PEGen 24, PEGen 28, and PEGen 32 were submitted to
GenBank and assigned accession numbers AI 144569, AI
144570, AI 144571, AI 144572, AI 144573, AI 144574, AI
144564, AI 144565, AI 144566, AI 144567 and AI 144568,
respectively.

This invention will be better understood from the 
Experimental Details which follow. However, one skilled 
in the art will readily appreciate that the specific 
methods and results discussed are merely illustrative of 
the invention as described more fully in the claims which 
follow thereafter. 

Experimental Details

We presently describe a reciprocal subtraction
differential RNA display (RSDD) approach that efficiently
and consistently reduces the complexity of DDRT-PCR and
results in the identification and cloning of genes
displaying anticipated differential expression. Proof of
principle for the RSDD approach has come from its
application to the identification of genes
differentially expressed during cancer progression. RSDD
has resulted in the identification and cloning of genes
displaying elevated expression in progressed tumor cells
(PEGen) and reduced expression in progressed tumor cells
(PSGen). The model used for RSDD was an
adenovirus-transformed rat embryo cell line, E11, that
acquires an aggressive oncogenic progression phenotype
when injected into athymic nude mice and reisolated in
cell culture (E11-NMT) (10,33,34). Injection of E11
cells into nude mice results in tumors in 100% of animals
with a tumor latency time of approximately 35 to 40 days,
whereas E11-NMT cells form tumors in 100% of nude mice
with a tumor latency time of 15 to 20 days (10,34,35).
Additionally, E11 cells form colonies in agar with an
efficiency of ~3%, whereas E11-NMT cells display an agar
cloning efficiency of >30% (10,33,34). The increased
tumorigenicity and enhanced anchorage independence
phenotypes are key indicators of tumor progression in the
E11/E11-NMT model system (10,33,34).

Differential RNA display was directly performed with
reciprocally subtracted cDNA plasmid libraries (E11 minus
E11-NMT and E11-NMT minus E11). Compared with the
subtraction of PCR-amplified cDNA in Hakvoort et al., the
subtracted cDNA libraries used in this experiment are
free from potential PCR artifacts and provide more stable
and consistent sources for DDRT-PCR analyses. In
addition, three single-anchored oligo dT 3' primers were
used instead of the two-base-anchored approach described by
Hakvoort et al. (32). To further streamline the DDRT-PCR
procedure, reamplified cDNAs identified using RSDD were
analyzed using the reverse Northern blotting procedure
(35,36). cDNAs displaying differential expression by
reverse Northern blotting were subsequently confirmed for
true differential expression by Northern analysis. These
modifications incorporated in the RSDD strategy result in
an efficient approach for using subtractive hybridization
and DDRT-PCR for identifying differentially expressed
genes.

Methods 

Total RNA from E11 and E11-NMT cells was isolated by the
guanidinium isothiocyanate/CsCl centrifugation procedure
and poly(A)+ RNA was purified with oligo(dT) cellulose
chromatography (5). Two λ-ZAP cDNA libraries from E11 and
E11-NMT mRNAs were constructed with λ-ZAP cDNA library
kits (Stratagene) following the manufacturer's protocol.
Reciprocal subtraction between E11 and E11-NMT libraries
was performed and two subtracted cDNA libraries (E11
minus E11-NMT and E11-NMT minus E11) were constructed as
described previously. Bacterial plasmid libraries from
the subtracted λ-ZAP cDNA libraries were obtained by in
vivo excision following the manufacturer's protocol
(Stratagene) and the plasmids were isolated with Qiagen
columns (Qiagen Inc.).

The purified plasmids of reciprocally subtracted cDNA
libraries were directly subjected to differential display
as in Liang et al. (38) with minor modifications. The
plasmids of reciprocally subtracted cDNA libraries were
PCR-amplified with the combination of three single-anchor
3' primers (T13A, T13C or T13G) and 18 arbitrary 5'
10-mer primers obtained from Operon Technology Inc.
(Alameda, CA; OPA 1-20 except OPA 1 and 3). The 20 µl PCR
reaction consisted of 10 mM Tris-HCl pH 8.4, 50 mM KCl,
1.5 mM MgCl2, 2 µM each dNTP, 0.2 µM 5' arbitrary primer,
1 µM 3' anchor primer, 50 ng of plasmid of a subtracted
library, 10 µCi α-35S-dATP (3000 Ci/mmole from Amersham)
and 1 U of Taq DNA polymerase (Gibco BRL). The
parameters of PCR were 30 sec at 95°C, 40 cycles of 30 sec
at 95°C, 2 min at 40°C and 30 sec at 72°C, and an additional
5 min at 72°C. After the cycling, 10 µl of 95%
formamide, 0.05% bromophenol blue and 0.05% xylene cyanol
were added to each PCR reaction. The mixture was heated
at 95°C for 2 min and separated in a 5% denaturing DNA
sequencing gel maintained at 50°C. PCR reactions of
plasmids from each subtracted library in a primer set
were run side by side. Differentially amplified bands
from plasmids of each subtracted library were marked with
an 18G needle through the film and cut out with a razor.
The gel slice was put in 100 µl TE pH 8.0 and incubated
at 4°C overnight. After the incubation, the mixture was
boiled for 5 min and microcentrifuged for two min. The
supernatant was collected and stored at -20°C until
reamplification. The band extract was reamplified with
the same cycling parameters in a 50 µl reaction
consisting of 10 mM Tris-HCl pH 8.4, 50 mM KCl, 1.5 mM
MgCl2, 20 µM each dNTP, 0.2 µM 5' arbitrary primer, 1 µM
3' anchor primer, 5 µl of band extract and 2.5 U of Taq
DNA polymerase (Gibco BRL).
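
The primer layout described above can be enumerated as a short sketch. It pairs the three single-anchor 3' primers with the 18 arbitrary 5' primers (OPA 1-20 except OPA 1 and 3) and lists each primer set for both subtracted libraries so that the two reactions sit next to each other, as they do on the gel. The label strings (for example "OPA2") and the function names are illustrative assumptions, not identifiers from the protocol.

```python
from itertools import product

# Three single-anchor 3' primers named in the text.
ANCHOR_PRIMERS = ["T13A", "T13C", "T13G"]

# OPA 1-20 except OPA 1 and OPA 3; the "OPA<n>" labels are illustrative.
ARBITRARY_PRIMERS = [f"OPA{i}" for i in range(1, 21) if i not in (1, 3)]

# The two reciprocally subtracted plasmid libraries.
LIBRARIES = ["E11 minus E11-NMT", "E11-NMT minus E11"]


def primer_sets():
    """Return every (anchor primer, arbitrary primer) combination."""
    return list(product(ANCHOR_PRIMERS, ARBITRARY_PRIMERS))


def reaction_plan():
    """List reactions so both libraries of a primer set are adjacent."""
    return [
        (library, anchor, arbitrary)
        for anchor, arbitrary in primer_sets()
        for library in LIBRARIES
    ]


if __name__ == "__main__":
    print(len(ARBITRARY_PRIMERS), "arbitrary primers")
    print(len(primer_sets()), "primer sets")
    print(len(reaction_plan()), "reactions across both libraries")
```

Under these assumptions the plan contains 54 primer sets, or 108 reactions when each set is run on both subtracted libraries.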

Differential expression of the reamplified DNA fragment
was scrutinized by reverse Northern and Northern blot
analyses. In reverse Northern analysis, after
confirmation in a 1% agarose gel, the reamplified DNA
fragment (10 µl of PCR reaction) was mixed with 90 µl TE
and spotted on a positively charged Nylon membrane
(Boehringer Mannheim) with a 96-well vacuum manifold.
The membrane was soaked with denaturing and neutralizing
solution successively, and the spotted DNA was
crosslinked to the membrane with a UV crosslinker
(Stratagene). 32P-labeled first strand cDNA was prepared
by reverse transcription of total RNA. After heating at
70°C for 10 min and quenching on ice for two min, the
mixture of 0.4 µM each T13A, T13G and T13C and 10 µg total
RNA was combined with 50 mM Tris-HCl, pH 8.3, 75 mM KCl,
3 mM MgCl2, 10 mM DTT, 0.5 mM dATP, 0.5 mM dGTP, 0.5 mM
dTTP, 0.02 mM dCTP, 0.5 µl RNase inhibitor (Gibco BRL),
100 µCi dCTP (3000 Ci/mmole from Amersham) and 200 U
Superscript RT II (Gibco BRL) in a final 25 µl reaction.
The reaction mixture was incubated at 42°C for one hr and
at 37°C for 30 min after addition of 2 µl of RNase H (10 U,
Gibco BRL). The membrane was hybridized at 42°C
overnight in a 50% formamide hybridization solution. The
hybridized membrane was washed twice at room temperature
for 15 min with 2X SSC containing 0.1% SDS and then at 55°C
for at least one hr with 0.1X SSC containing 0.1% SDS.
The membrane was probed with the 32P-labeled cDNA of E11,
stripped and then probed with the 32P-labeled cDNA of
E11-NMT. The signal intensity of each spot was normalized
against that of GAPDH and compared between E11 and
E11-NMT. Reamplified DNA fragments displaying expression
levels differing ≥1.8-fold between the two cell types
were selected and analyzed by Northern blotting analysis.
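
The selection rule described above (normalize each spot against GAPDH, then keep fragments that differ by the 1.8-fold cut-off between the two probes) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the category labels and the example intensities are hypothetical, and the cut-off is treated as inclusive.

```python
def normalized(spot_signal, gapdh_signal):
    """Express a spot intensity relative to the GAPDH spot on the same blot."""
    if gapdh_signal <= 0:
        raise ValueError("GAPDH signal must be positive")
    return spot_signal / gapdh_signal


def classify(spot_e11, gapdh_e11, spot_nmt, gapdh_nmt, cutoff=1.8):
    """Classify one spotted fragment from its two probings.

    cutoff=1.8 mirrors the fold difference stated in the text; treating it
    as inclusive (>=) is an assumption of this sketch.
    """
    e11 = normalized(spot_e11, gapdh_e11)
    nmt = normalized(spot_nmt, gapdh_nmt)
    if e11 <= 0 and nmt <= 0:
        return "no signal"
    if nmt >= cutoff * e11:
        return "PEGen candidate"  # higher in progressed E11-NMT cells
    if e11 >= cutoff * nmt:
        return "PSGen candidate"  # higher in unprogressed E11 cells
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical intensities: (spot E11, GAPDH E11, spot E11-NMT, GAPDH E11-NMT)
    print(classify(100, 50, 400, 50))
    print(classify(300, 100, 100, 100))
    print(classify(100, 100, 150, 100))
```

Comparing products rather than dividing one normalized value by the other avoids a division by zero when a fragment gives no signal with one of the two probes.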


In Northern blot analysis, 10 µg of total RNA from E11
and E11-NMT cells were run side-by-side in a 1% agarose
gel with formaldehyde and transferred to a positively
charged Nylon membrane. The reamplification reaction (5 µl)
was 32P-labeled with a multiprime labeling kit (Boehringer
Mannheim) and used to probe the membrane as described above.
DNA fragments expressed differentially between E11 and
E11-NMT in Northern blot analyses were cloned into the
EcoRV site of the pZErO-2.1 cloning vector (Invitrogen)
and sequenced. In order to confirm differential
expression, the cloned cDNA fragment was released by
EcoRI-XhoI digestion, 32P-labeled and used to probe
Northern blots as described above. Samples of RNAs from
various E11 and E11-NMT derivatives displaying either a
progressed or suppressed progression phenotype, based on
nude mice tumorigenesis and soft agar cloning assays, were
analyzed. These included E11, E11-NMT, CREF X E11-NMT F1
and F2 somatic cell hybrids (suppressed progression
phenotype), CREF X E11-NMT R1 and R2 somatic cell hybrids
(progression phenotype), E11 X E11-NMT A6 somatic cell
hybrid (suppressed progression phenotype), E11 X E11-NMT
A6TD tumor-derived somatic cell hybrid (progression
phenotype), E11 X E11-NMT 3b somatic cell hybrid
(suppressed progression phenotype), E11 X E11-NMT 2a
(progression phenotype), E11-NMT AZA B1 and C1
5-azacytidine treated E11-NMT clones (suppressed
progression phenotype), E11-ras R12 clone containing the
Ha-ras oncogene (progression phenotype) and E11-HPV E6/E7
clone containing the human papilloma virus-18 E6 and E7
gene region (progression phenotype). Differential
expression of the PEGen and PSGen genes in the various
cell types was confirmed using 32P-labeled probes and
Northern hybridization analysis. After reconfirmation of
differential expression, the plasmids containing the
differentially expressed DNA fragments were sequenced by
the dideoxy sequencing procedure.

Results and Discussion 

Subtraction hybridization provides a direct means of
enriching for unique cDNA species and eliminating common
sequences between complex genomes. DDRT-PCR is a proven
methodology for the rapid identification and cloning of
differentially expressed sequences between cell types
(3,4,22). In principle, subtraction hybridization
combined with DDRT-PCR should reduce band complexity,
which often obscures the identification of differentially
expressed genes and generates false positive signals
(23,28). This strategy, RSDD, has been used to analyze
genes differentially expressed during transformation
progression. The differential RNA display pattern of E11
and E11-NMT cells using standard differential RNA display
(DDRT-PCR) and RSDD is shown in Fig. 1 (Left Panel). As
predicted, the differential RNA display pattern of RSDD
was much less complex than that of DDRT-PCR. The
majority of bands common to both cDNA samples were
eliminated using RSDD. These experiments demonstrate
that subtractive hybridization prior to differential RNA
display is effective in simplifying display patterns and
permits the efficient identification of differentially
expressed cDNAs. Since RSDD significantly reduced the
number of bands displayed, single-anchor oligo dT
primers, which can increase band numbers, were
successfully used in subsequent applications of the RSDD
approach (Fig. 1, Right Panel). Using RSDD, 235
differentially displayed cDNAs in the E11/E11-NMT tumor
progression model system were isolated.

Hakvoort et al. (32) used a reciprocal subtraction
approach to analyze gene expression changes occurring
during liver regeneration following 70% hepatectomy,
i.e., normal liver subtracted from partially
hepatectomized regenerating liver and vice versa.
Although some bands displayed apparent enrichment, the
complexity of the display pattern did not show
appreciable simplification. These results are in stark
contrast to RSDD, which results in a clear delineation
and simplification of differentially expressed amplified
bands (Fig. 1). Although conceptually similar, RSDD is
significantly more effective than the subtraction plus
DDRT-PCR approach described by Hakvoort et al. (32). The
improved efficiency of RSDD versus the Hakvoort et al.
(32) approach can be attributed to several factors. The
approach of Hakvoort et al. (32) is based on the
subtraction procedure described by Wang and Brown (38).
This approach involves multiple rounds of
PCR-amplification prior to each round of subtractive
hybridization. In contrast, RSDD involves a single round
of reciprocal subtraction that does not involve PCR
amplification (5,10). In this respect, the complicated
display pattern observed by Hakvoort et al. (32) even
after three or four rounds of subtraction might result
from reduced subtraction efficiency, PCR artifacts or a
combination of these problems. Increasing the number of
reactions by using two-base-anchored oligo dT
primers did not reduce the complexity of displayed bands
(32). In these contexts, a critical component for the
successful use of RSDD involves the use of an appropriate
subtraction hybridization protocol that can efficiently
reduce cDNA complexity and generate stable populations of
cDNAs for analysis.

Previous studies demonstrate that different gene cloning
strategies, including DDRT-PCR, subtraction hybridization
and electronic display, identify dissimilar
differentially expressed genes (18). These results
suggest that a single approach for gene identification
may not identify the complete spectrum of differentially
expressed genes (18). Similarly, RSDD and DDRT-PCR do
not resolve the same differentially expressed bands (Fig.
1). Unique bands identified in DDRT-PCR that were
differentially expressed when analyzed by Northern
blotting were not the same as those found using RSDD and
vice versa. These results are not surprising, since, as
indicated above, subtraction hybridization and
differential RNA display identified distinct
differentially expressed genes. Apparently, specific
differentially expressed genes are lost during
subtraction hybridization and differential RNA display of
subtracted cDNAs. On the basis of these considerations,
it will be essential to use multiple gene discovery
approaches to identify and clone the complete spectrum of
differentially expressed genes.

DDRT-PCR can generate large numbers of differentially
displayed bands, making subsequent analysis both labor
intensive and a daunting challenge. In order to reduce
these limitations of DDRT-PCR, RSDD has been used in
combination with reverse Northern analyses of isolated
cDNAs. Gel extracted cDNA fragments were reamplified,
dot-blotted on Nylon membranes and successively probed
with reverse transcribed 32P-cDNA from E11 or E11-NMT RNAs
(Fig. 2). Signals were detected in 181 reamplified bands
out of 235 (77%). This number is lower than that
observed using DDRT-PCR (51 out of 54). However, this
comparison may not be accurate since only four arbitrary
primers were used for DDRT-PCR and fewer differentially
expressed bands were detected and isolated. A possible
reason for the high incidence of false positives in RSDD
may be the existence of foreign plasmid-like DNA
in the cDNAs and the inaccurate reading properties of
DDRT-PCR.

Table 1. Differentially Expressed cDNA Fragments
Cloned by DDRT-PCR.

Nomenclature   Identity                          Homology

PEGen 41       To be determined
PEGen 42       Novel                             Novel
PEGen 43       Novel                             Novel
PEGen 44       Novel                             Novel
PEGen 45       Hoxa11 locus antisense            mouse 90%
PEGen 46       Glutamyl t-RNA synthetase         human 59%
PEGen 48       Novel                             Novel
PEGen 50       Novel                             Novel
PSGen 1        Supervillin                       B. taurus 80%
PSGen 2        HTLV-1 Tax interacting protein    human 91%
PSGen 4        Proteasome activator              Rat 100%
PSGen 27       Novel

The signal intensities of the various cDNAs in reverse
Northern analysis were quantified and normalized against
that of GAPDH, which remained unchanged in E11 and
E11-NMT cells. The PEG-3 (PEGen-3) gene (10) was used as
an additional control, to verify increased expression in
E11-NMT versus E11 cells. In the reverse Northern
analyses, PEGen-3 levels were 4-fold higher in E11-NMT
than in E11 cells, which coincided with Northern blotting
results, thereby demonstrating the concordance of reverse
Northern and Northern assays. A >1.8-fold differential
cut-off (after normalization for GAPDH expression) was
used to identify and isolate cDNA bands displaying
modified expression in E11 versus E11-NMT cells. This
resulted in the identification of 7 cDNAs with higher
expression in E11 versus E11-NMT cells and 65 cDNAs with
elevated expression in E11-NMT versus E11 cells. These
results suggest that tumor progression in E11-NMT cells
correlates with the increased expression of a large
number of genes, whereas only a smaller subset of genes
display decreased expression.
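
As a quick arithmetic check on the counts reported in this section, the sketch below recomputes the fractions from the stated numbers (235 isolated bands, 181 with detectable signal, 51 of 54 for DDRT-PCR, and 7 plus 65 bands passing the cut-off). The variable and function names are illustrative; only the counts come from the text.

```python
# Counts taken from the text.
RSDD_ISOLATED = 235        # differentially displayed cDNAs isolated by RSDD
RSDD_WITH_SIGNAL = 181     # reamplified bands giving a reverse Northern signal
DDRT_ISOLATED = 54         # bands isolated by standard DDRT-PCR
DDRT_WITH_SIGNAL = 51      # of those, bands giving a signal
HIGHER_IN_E11 = 7          # cDNAs with higher expression in E11
HIGHER_IN_E11_NMT = 65     # cDNAs with higher expression in E11-NMT


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


differential = HIGHER_IN_E11 + HIGHER_IN_E11_NMT

if __name__ == "__main__":
    print("RSDD bands with signal:", percent(RSDD_WITH_SIGNAL, RSDD_ISOLATED), "%")
    print("DDRT-PCR bands with signal:", percent(DDRT_WITH_SIGNAL, DDRT_ISOLATED), "%")
    print("Bands passing the cut-off:", differential)
    print("Share of isolated RSDD bands:", percent(differential, RSDD_ISOLATED), "%")
```

The 7 and 65 bands sum to 72 candidates, which is roughly 31% of the 235 bands originally isolated.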


A problem present in DDRT-PCR, which is reduced but can
still occur in RSDD, is the isolation of multiple cDNA
species from what appears to be a single amplified band.
When this occurs, these multiple species can produce
spurious results when analyzed by reverse Northern
analyses. For example, if two distinct species are
isolated, one displaying modified expression and a second
not displaying modified expression, an accurate estimate
of differential expression will not be obtained by
reverse Northern analysis. In this case, a number of
potential false positives generated using reverse
Northern analyses may in reality not be false positives,
but instead may represent multiple cDNAs. This problem
may be ameliorated by performing single strand
conformational polymorphism (SSCP) or reverse Northern
analyses using cloned cDNA populations (39,40).

The expression pattern of representative RSDD-derived
cDNAs in E11 versus E11-NMT and in a more expanded
E11/E11-NMT progression cell culture series is shown in
Figs. 3 and 4, respectively. Reverse Northern results
correlated well with Northern blots using E11 and
E11-NMT (~80% concordance) or a larger panel of cells
differentially displaying the progression phenotype,
including progression-negative E11, CREF X E11-NMT F1,
CREF X E11-NMT F2, E11 X E11-NMT A6, E11 X E11-NMT 3b,
E11-NMT Aza B1 and E11-NMT Aza C1, and progression-positive
E11-NMT, CREF X E11-NMT R1, CREF X E11-NMT R2,
E11 X E11-NMT A6TD, E11 X E11-NMT IIa, E11-ras and
E11-HPV E6/E7. Sequence analysis of the various
progression upregulated genes (PEGen) and progression
suppressed genes (PSGen) identified both known and
unknown genes (Table 2). Known PEGen genes included
PEGen 7 (HPV16 E1BP), PEGen 8 (PFK-C), PEGen 21 (FIN 14)
and PEGen 26 (poly ADP-ribose polymerase) and a known
PSGen gene was PSGen 10 (ferritin heavy chain). Two
PEGen genes out of six were found to be novel (PEGen 14
and PEGen 24) and two PSGen genes out of three were found
to be novel (PSGen 12 and PSGen 13) (Table 2).

Table 2. Differentially Expressed cDNA Fragments
Cloned by RSDD

Nomenclature   Identity                      Homology

PEGen 7        HPV16 E1BP                    Human 90%
PEGen 8        PFK-C                         Rat 100%
PEGen 13       Novel                         Novel
PEGen 14       Novel                         Novel
PEGen 15       Novel                         Novel
PEGen 21       FIN 14                        Mouse 94%
PEGen 24       Novel                         Novel
PEGen 26       Poly ADP-ribose Polymerase    Rat 100%
PEGen 28       Novel                         Novel
PEGen 32       Novel                         Novel
PSGen 10       Ferritin Heavy Chain          Rat 100%
PSGen 12       Novel                         Novel
PSGen 13       Novel                         Novel
PSGen 23       Novel                         Novel
PSGen 24       Novel                         Novel
PSGen 25       Novel                         Novel
PSGen 26       Novel                         Novel
PSGen 27       Novel                         Novel
PSGen 28       Novel                         Novel
PSGen 29       Novel                         Novel



PEGen 7 is expressed at ~5-fold higher levels in E11-NMT
than in E11 cells. PEGen 7 is ~90% homologous to
16E1-BP, a cDNA encoding a protein identified using the
yeast two-hybrid assay that interacts with human
papillomavirus type 16 E1 protein (41). 16E1-BP encodes
a 432-aa protein of unknown function but does contain an
ATPase signature motif (Gly-X4-Gly consensus ATP binding
motif at aa 179 through 186). 16E1-BP appears to be a
form of TRIP13, a protein previously shown to bind
thyroid hormone receptor in yeast two-hybrid assays. The
role of PEGen 7/16E1-BP in the progression phenotype in
the E11/E11-NMT progression model is not known.
Additional studies are necessary to determine if this
gene change is associative or causative of transformation
progression.

PEGen 8 is expressed at ~3- to 4-fold higher levels in
E11-NMT than in E11 cells. PEGen 8 shows 100% homology
to rat phosphofructokinase C (PFK-C) (42). PFK catalyzes
the rate-limiting and committed step in glycolysis, the
conversion of fructose 6-phosphate to fructose
1,6-bisphosphate. Three subunit isozymes of PFK have been
identified that form homo- and heterotetramers with
differing catalytic and allosteric properties. PFK-M is
specific for cardiac and skeletal muscle, PFK-L is
expressed in many tissues but is most abundant in the
liver and PFK-C is expressed in several brain regions and
the anterior pituitary but not in liver, skeletal muscle,
or several other human tissues. The cDNA of PFK-C
isolated from a rat hypothalamic cDNA library is 2643 bp
and encodes a protein of 765 aa (42). In a recent study,
Sanchez-Martinez and Aragon (43) demonstrated that PFK-C
is the predominant form of PFK in ascites tumor cells
(obtained from a transplantable mouse carcinoma of
mammary origin), whereas PFK-L is most abundant in the
normal mammary gland. These results suggest the
interesting possibility that PFK-C might contribute to
the malignant nature of specific target cells. The role
of PEGen 8/PFK-C in progression in the E11/E11-NMT model
remains to be determined.

PEGen 21 is expressed at ~3- to 4-fold higher levels in
E11-NMT than in E11 cells. PEGen 21 displays 94%
homology with the fibroblast growth factor-4 inducible
gene FIN-14 (44). FIN-14 is a novel cDNA of unknown
function that hybridizes with a 4.5 kb mRNA that is
induced 4-fold in NIH3T3 mouse cells following treatment
with FGF-4. The induction of FIN-14 occurs late (18 hr)
after treatment with FGF-4 and does not occur when cells
are treated for 18 hr with FGF-4 in the presence of
cycloheximide (44). These results confirm that FIN-14
encodes a late-inducible gene. Moreover, nuclear run-on
assays document that FIN-14 is transcriptionally activated
in NIH3T3 cells following growth factor stimulation.
Tissue distribution studies indicate expression of a
single mRNA species in the kidney with low levels of
expression observed in several other tissues including
testis and thymus. Mouse embryogenesis studies indicate
that FIN-14 expression occurs constitutively in mouse
embryos between day 10.5 and 15.5. Unlike in NIH3T3 cells,
FIN-14 was constitutively expressed in PC12 cells and its
level did not vary appreciably in response to growth factor
stimulation. The role of PEGen 21/FIN-14 in progression
in the E11/E11-NMT model system is not currently known.

The PSGen cDNAs, PSGen-12 and PSGen-13, consist of
sequences without homology to those presently reported in
various DNA databases. Expression of these cDNAs is ~3-
to 4-fold higher in E11 versus E11-NMT cells (Fig. 3).
It is not currently known whether these genes simply
correlate with or functionally regulate the progression
phenotype. The identification of full-length cDNAs for
PSGen-12 and PSGen-13 is in progress and, once they are
identified, experiments can be conducted to directly define
the role of these PSGens in cancer progression.

We presently demonstrate that a modified differential RNA
display technique, RSDD, can efficiently identify
differentially expressed cDNAs. As predicted,
subtractive hybridization prior to differential RNA
display greatly reduces band complexity, a problem
encountered in standard DDRT-PCR in which RNA samples are
directly analyzed without subtraction. Unlike a previous
report using subtracted cDNAs processed through
successive rounds of PCR (32,45), common bands were
eliminated using reciprocally subtracted cDNA libraries
that had not been processed using PCR. In addition to
subtraction hybridization, the discovery of
differentially expressed genes was further streamlined by
using reverse Northern analyses with isolated cDNAs.
With 3 single-anchored oligo dT primers and 18 arbitrary
5' primers, 72 bands were identified that displayed
differential expression using reverse Northern analysis.
Currently, 40 of these cDNA species have been analyzed by
Northern blotting and found to display differential
expression in E11 versus E11-NMT cells. Subsequent
studies with the majority of these RSDD cDNAs
demonstrated coordinated expression with the progression
phenotype in a large panel of unprogressed and progressed
transformed cells. Current sequence analysis of the
cloned cDNA fragments revealed 9 different genes,
including 4 novel genes not reported in recent DNA
databases. RSDD represents a method of choice either as
a more efficient and less time consuming modification of
the differential RNA display strategy or as a screening
methodology for identifying differentially expressed
genes in reciprocally subtracted cDNA libraries.

Second Series of Experiments

Presently described is an RSDD approach that efficiently
and consistently reduces the complexity of DDRT-PCR and
results in the identification and cloning of genes
displaying anticipated differential expression. The model
used for RSDD was an adenovirus-transformed rat embryo
cell line, E11, that acquires an aggressive oncogenic
progression phenotype when injected into athymic nude
mice and reestablished in cell culture
(E11-NMT) (6,26,27). Injection of E11 cells into nude mice
results in tumors in 100% of animals with a tumor latency
time of approximately 35 to 40 days, whereas E11-NMT
cells form tumors in 100% of nude mice with a tumor
latency time of 15 to 20 days (6,26,27). Additionally,
E11 cells form colonies in agar with an efficiency of ~3%,
whereas E11-NMT cells display an agar cloning efficiency of
>30% (6,26,27). The increased tumorigenicity and enhanced
anchorage independence phenotypes are key indicators of
tumor progression in the E11/E11-NMT model system
(6,26,27). RSDD has resulted in the identification and
cloning of genes displaying elevated expression in
progressed tumor cells (progression elevated gene, PEGen)
and suppressed expression in progressed tumor cells
(progression suppressed gene, PSGen).

MATERIALS AND METHODS 

RNA isolation and cDNA library construction. Total RNA
from E11 and E11-NMT cells was isolated by the
guanidinium isothiocyanate/CsCl centrifugation procedure
and poly(A)+ RNA was purified with oligo(dT) cellulose
chromatography (5). Two λ-ZAP cDNA libraries from E11 and
E11-NMT mRNAs were constructed with λ-ZAP cDNA library
kits (Stratagene) following the manufacturer's protocol.
Reciprocal subtraction between E11 and E11-NMT libraries
was performed and two subtracted cDNA libraries (E11
minus E11-NMT and E11-NMT minus E11) were constructed as
described (5,6). Plasmid cDNA libraries from the
subtracted λ-ZAP cDNA libraries were obtained by in vivo
excision following the manufacturer's protocol
(Stratagene) and the plasmids were isolated with Qiagen
columns (Qiagen, Chatsworth, CA).

RSDD methodology. The purified plasmids of reciprocally
subtracted cDNA libraries were directly subjected to
differential display as in Liang et al. (28) with minor
modifications. The plasmids of reciprocally subtracted
cDNA libraries were PCR-amplified with the combination of
three single-anchor 3' primers (T13A, T13C or T13G) and 18
arbitrary 5' 10-mer primers obtained from Operon
Technology Inc. (Alameda, CA; OPA 1-20 except OPA1 and
OPA3). The 20 µl PCR reaction consisted of 10 mM Tris-HCl
(pH 8.4), 50 mM KCl, 1.5 mM MgCl2, 2 µM each dNTP, 0.2 µM
5' arbitrary primer, 1 µM 3' anchor primer, 50 ng of
plasmid of a subtracted library, 10 µCi α-35S-dATP (3,000
Ci/mmol from Amersham) and 1 unit of Taq DNA polymerase
(Gibco/BRL). The parameters of PCR were 30 sec at 95°C, 40
cycles of 30 sec at 95°C, 2 min at 40°C and 30 sec at 72°C
and an additional 5 min at 72°C. After the cycling, 10 µl of
95% formamide, 0.05% bromophenol blue and 0.05% xylene
cyanol were added to each PCR reaction. The mixture was
heated at 95°C for 2 min and separated in a 5% denaturing
DNA sequencing gel maintained at 50°C. PCR reactions of
plasmids from each subtracted library in a primer set
were run side by side. Differentially amplified bands
from plasmids of each subtracted library were marked with
an 18G needle through the film and cut out with a razor. The
gel slice was put in 100 µl TE (pH 8.0) and incubated at
4°C overnight. After the incubation, the mixture was
boiled for 5 min and microcentrifuged for two min. The
supernatant was collected and stored at -20°C until
reamplification. The band extract was reamplified with
the same cycling parameters in a 50 µl reaction
consisting of 10 mM Tris-HCl (pH 8.4), 50 mM KCl, 1.5 mM
MgCl2, 20 µM each dNTP, 0.2 µM 5' arbitrary primer, 1 µM
3' anchor primer, 5 µl of band extract and 2.5 units of
Taq DNA polymerase (Gibco/BRL).
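The primer combinations described above can be enumerated with a
short sketch. It assumes that every anchor primer is paired with
every arbitrary primer and that each pair is run on both
subtracted libraries; the "OPA<n>" labels and the function names
are illustrative only and are not taken from the Operon catalog.

```python
from itertools import product

# Single-anchor 3' primers named in the text (T13 followed by A, C or G).
ANCHOR_PRIMERS = ["T13A", "T13C", "T13G"]

# Arbitrary 5' 10-mers: OPA 1-20 except OPA1 and OPA3 (18 primers).
# The "OPA<n>" strings are illustrative labels, not catalog identifiers.
ARBITRARY_PRIMERS = [f"OPA{n}" for n in range(1, 21) if n not in (1, 3)]

# The two reciprocally subtracted plasmid libraries.
LIBRARIES = ("E11 minus E11-NMT", "E11-NMT minus E11")


def primer_sets():
    """Return every (anchor, arbitrary) primer pair (assumed full cross)."""
    return list(product(ANCHOR_PRIMERS, ARBITRARY_PRIMERS))


def reaction_plan(libraries=LIBRARIES):
    """Pair each primer set with both libraries so lanes run side by side."""
    return [(lib, anchor, arb)
            for (anchor, arb) in primer_sets()
            for lib in libraries]


if __name__ == "__main__":
    print(len(ARBITRARY_PRIMERS), "arbitrary primers")
    print(len(primer_sets()), "primer sets")
    print(len(reaction_plan()), "PCR reactions")
```

Under these assumptions the plan contains 3 x 18 primer sets, each
run once per subtracted library.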

Reverse Northern Blotting Procedure. Differential
expression of the reamplified DNA fragment was
scrutinized by reverse Northern and Northern blot
analyses. In reverse Northern analysis, after
confirmation in a 1% agarose gel, the reamplified DNA
fragment (10 µl of PCR reaction) was mixed with 90 µl TE
and spotted on a positively charged Nylon membrane
(Boehringer Mannheim) with a 96-well vacuum manifold. The
membrane was soaked with denaturing and neutralizing
solution successively, and the spotted DNA was
crosslinked to the membrane with a UV crosslinker
(Stratagene). 32P-labeled first strand cDNA was prepared
by reverse transcription of total RNA. After heating at
70°C for 10 min and quenching on ice for two min, 0.4 µM
each T13A, T13G and T13C and 10 µg total RNA mixture was
added with 50 mM Tris-HCl (pH 8.3), 75 mM KCl, 3 mM
MgCl2, 10 mM DTT, 0.5 mM dATP, 0.5 mM dGTP, 0.5 mM dTTP,
0.02 mM dCTP, 0.5 µl RNase inhibitor (Gibco/BRL), 100 µCi
dCTP (3,000 Ci/mmol from Amersham) and 200 units
Superscript RT II (Gibco/BRL) in a final 25 µl reaction.
The reaction mixture was incubated at 42°C for one hour
and at 37°C for 30 min after addition of 2 µl of RNase H
(10 units, Gibco/BRL). The membrane was hybridized at 42°C
overnight in a 50% formamide hybridization solution. The
hybridized membrane was washed at room temperature for 15
min with 2X standard saline citrate containing 0.1% SDS
twice and at 55°C for at least one hour with 0.1X standard
saline citrate containing 0.1% SDS, successively. The
membrane was probed with the 32P-labeled cDNA of E11,
stripped and probed with 32P-labeled cDNA of E11-NMT.
The signal intensity of each spot was normalized against
that of glyceraldehyde-3-phosphate dehydrogenase and
compared between E11 and E11-NMT. Reamplified DNA
fragments displaying a >1.8-fold difference in expression
level between the two cell types were selected
and analyzed by Northern blotting analysis.
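The normalization and selection step can be sketched as follows.
This is a minimal illustration, assuming that spot and GAPDH
intensities are available as background-corrected numbers from
each probing; the function names, argument names and return
labels are hypothetical and do not describe any particular
quantification software.

```python
def normalized(spot, gapdh):
    """Express a spot intensity relative to the GAPDH spot on the same blot."""
    if gapdh <= 0:
        raise ValueError("GAPDH signal must be positive for normalization")
    return spot / gapdh


def classify_spot(spot_e11, gapdh_e11, spot_nmt, gapdh_nmt, cutoff=1.8):
    """Classify one reamplified fragment by its normalized signal ratio.

    Returns a hypothetical label:
      "PEGen candidate"  if signal is > cutoff-fold higher in E11-NMT
      "PSGen candidate"  if signal is > cutoff-fold higher in E11
      "no signal"        if neither probing gave a signal
      "unchanged"        otherwise
    """
    e11 = normalized(spot_e11, gapdh_e11)
    nmt = normalized(spot_nmt, gapdh_nmt)
    if e11 == 0 and nmt == 0:
        return "no signal"
    # Compare by multiplication so that a zero signal in one cell
    # type does not cause a division by zero.
    if nmt > cutoff * e11:
        return "PEGen candidate"
    if e11 > cutoff * nmt:
        return "PSGen candidate"
    return "unchanged"


if __name__ == "__main__":
    # Illustrative intensities only, not measured values.
    print(classify_spot(100, 50, 400, 50))
    print(classify_spot(300, 100, 100, 100))
    print(classify_spot(100, 100, 150, 100))
```

Fragments labeled as candidates in such a screen would still
require confirmation by Northern blotting, as described in the
text.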

Northern Blotting Analysis. In Northern blot analysis, 10
µg of total RNA from E11 and E11-NMT cells were run
side-by-side in a 1% agarose gel with formaldehyde and
transferred to a positively charged Nylon membrane.
Reamplification reaction (5 µl) was 32P-labeled with a
multiprime labeling kit (Boehringer Mannheim) and used to
probe the membrane as described above. DNA fragments
expressed differentially between E11 and E11-NMT in
Northern blot analyses were cloned into the EcoRV site of
the pZErO-2.1 cloning vector (Invitrogen) and sequenced.

To confirm differential expression, the cloned cDNA
fragment was released by EcoRI-XhoI, 32P-labeled and used
to probe Northern blots as described above. Samples of
RNAs from various E11 and E11-NMT derivatives displaying
either a progressed or suppressed progression phenotype,
based on nude mice tumorigenesis and soft agar cloning
assays, were analyzed. These included E11, E11-NMT, CREF
x E11-NMT F1 and F2 somatic cell hybrids (suppressed
progression phenotype), CREF x E11-NMT R1 and R2 somatic
cell hybrids (progression phenotype), E11 x E11-NMT A6
somatic cell hybrid (suppressed progression phenotype),
E11 x E11-NMT A6TD tumor-derived somatic cell hybrid
(progression phenotype), E11 x E11-NMT 3b somatic cell
hybrid (suppressed progression phenotype), E11 x E11-NMT
IIa (progression phenotype), E11-NMT AZA B1 and C1
5-azacytidine treated E11-NMT clones (suppressed
progression phenotype), E11-Ras R12 clone containing the
Ha-ras oncogene (progression phenotype) and E11-HPV E6/E7
clone containing the human papilloma virus-18 E6 and E7
gene region (progression phenotype). Differential
expression of the PEGen and PSGen genes in the various
cell types was confirmed using 32P-labeled probes and
Northern hybridization analysis. After reconfirmation of
differential expression, the plasmids containing the
differentially expressed DNA fragments were sequenced by
the dideoxy sequencing procedure.

RESULTS AND DISCUSSION 

Subtraction hybridization provides a direct means of
enriching for unique cDNA species and eliminating common
sequences between complex genomes (7, 18). DDRT-PCR is a
proven methodology for the rapid identification and
cloning of differentially expressed sequences between
cell types (1,2,28). In principle, subtraction
hybridization combined with DDRT-PCR should reduce band
complexity, which often obscures the identification of
differentially expressed genes and generates false
positive signals (21,29). RSDD has been used to analyze
genes differentially expressed during transformation
progression (Fig. 28). Differential RNA display was
directly performed with reciprocally subtracted cDNA
plasmid libraries (E11 minus E11-NMT and E11-NMT minus
E11) that had not been subjected to PCR. Three single
anchored oligo dT 3' primers were used for subsequent
amplification prior to display. To further streamline the
DDRT-PCR procedure, reamplified cDNAs identified using
RSDD were analyzed using the reverse Northern blotting
procedure (30,31). cDNAs displaying differential
expression by reverse Northern blotting were subsequently
confirmed for true differential expression by Northern
analysis.

The differential RNA display pattern of E11 and E11-NMT
cells using standard differential RNA display (DDRT-PCR)
and RSDD is shown in Fig. 1 (Left Panel). The
differential RNA display pattern of RSDD is much less
complex than that of DDRT-PCR. These experiments
demonstrate that subtractive hybridization prior to
differential RNA display is effective in simplifying
display patterns, permitting the efficient identification
of differentially expressed cDNAs. Since RSDD
significantly reduced the number of bands displayed,
single anchor oligo dT primers, which can increase band
numbers, were successfully used in subsequent
applications of the RSDD approach (Fig. 1; Right Panel).
Using RSDD, 234 differentially displayed cDNAs in the
E11/E11-NMT tumor progression model system were isolated.
Hakvoort et al. (25) used a reciprocal subtraction
approach to analyze gene expression changes occurring
during liver regeneration following 70% hepatectomy,
i.e., normal liver subtracted from partially
hepatectomized regenerating liver and vice versa.
Although some bands displayed apparent enrichment, the
complexity of the display pattern did not show
appreciable simplification. In contrast, RSDD results in
a clearer delineation and simplification of
differentially expressed amplified bands (Fig. 1).
Although conceptually similar, RSDD is significantly more
effective than the subtraction plus DDRT-PCR approach
described by Hakvoort et al. (25). The reasons for the
improved efficiency of RSDD versus the Hakvoort et al.
(25) approach are not known. One possibility is that the
differences between the experimental approaches may
reflect the subtraction hybridization strategies
employed. The approach of Hakvoort et al. (25) is based
on the subtraction procedure described by Wang and Brown
(32). This approach uses multiple rounds of
PCR-amplification prior to each round of subtractive
hybridization. In contrast, RSDD involves a single round
of reciprocal subtraction without intermediate
amplification (5, 6). In this respect, the complicated
display pattern observed by Hakvoort et al. (25) even
after three or four rounds of subtraction might result
from reduced subtraction efficiency, PCR artifacts or a
combination of these problems. Increasing the number of
reactions by using two-base pair anchored oligo dT
primers did not reduce the complexity of displayed bands
(25). In these contexts, a critical component for the
successful use of RSDD involves the use of an appropriate
subtraction hybridization protocol, which can efficiently
reduce cDNA complexity and generate stable populations of
cDNAs for analysis.

Previous studies demonstrate that different gene cloning
strategies, including DDRT-PCR, subtraction hybridization
and electronic display, identify distinct subsets of
differentially expressed genes (18). These results
suggest that a single approach for gene identification
may not identify the complete spectrum of differentially
expressed genes. Similarly, RSDD and DDRT-PCR do not
resolve the same differentially expressed bands (Fig. 1).
Unique bands identified in DDRT-PCR that were
differentially expressed when analyzed by Northern
blotting were not the same as those found using RSDD and
vice versa (data not shown). These results are not
surprising, since, as indicated above, subtraction
hybridization and differential RNA display identified
distinct differentially expressed genes (18). Apparently,
specific differentially expressed genes are lost during
subtraction hybridization and differential RNA display of
subtracted cDNAs. On the basis of these considerations,
it will be essential to use multiple gene discovery
approaches to identify and clone the complete spectrum of
differentially expressed genes.

DDRT-PCR can generate large numbers of differentially
displayed bands, making subsequent analysis both labor
intensive and a daunting challenge. In order to reduce
these limitations of DDRT-PCR, RSDD has been used in
combination with reverse Northern analyses of isolated
cDNAs. Gel extracted cDNA fragments were reamplified,
dot-blotted on Nylon membranes and successively probed
with reverse transcribed 32P-cDNA from E11 or E11-NMT RNAs
(Fig. 2). Signals were detected in 181 reamplified bands
out of 234 (77%).

The signal intensities of the various cDNAs in reverse
Northern analysis were quantified and normalized against
that of GAPDH, which remained unchanged in E11 and
E11-NMT cells. Progression elevated gene-3 (PEG-3) (6) was
used as an additional control, to verify increased
expression in E11-NMT versus E11 cells. In the reverse
Northern analyses, PEG-3 levels were 4-fold higher in
E11-NMT than in E11 cells, which coincided with Northern
blotting results, thereby demonstrating the concordance
of reverse Northern and Northern assays. A >1.8-fold
differential cut-off (after normalization for GAPDH
expression) was used to identify and isolate cDNA bands
displaying modified expression in E11 versus E11-NMT
cells. This resulted in the identification of 7 cDNAs
with higher expression in E11 versus E11-NMT cells and 65
cDNAs with elevated expression in E11-NMT versus E11
cells. These results suggest that tumor progression in
E11-NMT cells correlates with increased expression of a
large number of genes, whereas only a smaller subset of
genes display decreased expression.



A problem frequently encountered in DDRT-PCR, which is
reduced but can still occur in RSDD, is the isolation of
multiple cDNA species from what appears to be a single
amplified band. When this occurs, these multiple species
can produce spurious results when analyzed by reverse
Northern analyses. For example, if two distinct species
are isolated, one displaying modified expression and a
second not displaying modified expression, an accurate
estimate of differential expression will not be obtained
by reverse Northern analysis. In this case, a number of
potential false positives generated using reverse
Northern analyses may in reality not be false positives,
but instead may represent multiple cDNAs. This problem
can be ameliorated by performing single strand
conformational polymorphism (SSCP) or reverse Northern
analyses using cloned cDNA populations (33,34).
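The effect of a mixed band on the reverse Northern estimate can
be illustrated with a simple calculation. The sketch below rests
on assumptions that are not stated in the text: hybridization
signal is taken to be additive and proportional to abundance, and
"fraction" denotes the share of the spot signal (in the reference
cell type) contributed by the differentially expressed species.
The function names and example numbers are hypothetical.

```python
def observed_fold(true_fold, fraction):
    """Apparent fold change of a spot containing two cDNA species.

    One species changes by `true_fold`; the other is unchanged.
    `fraction` is the share of the reference signal contributed by
    the changing species (assumed additive signal model).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return fraction * true_fold + (1.0 - fraction)


def min_fraction_to_pass(true_fold, cutoff=1.8):
    """Smallest fraction at which the mixed spot reaches the cut-off.

    Returns None if even a pure spot would not exceed the cut-off.
    """
    if true_fold <= cutoff:
        return None
    return (cutoff - 1.0) / (true_fold - 1.0)


if __name__ == "__main__":
    # Illustrative values: a species elevated 4-fold, mixed with an
    # unchanged species in the same gel band.
    for fraction in (1.0, 0.5, 0.2):
        print(fraction, observed_fold(4.0, fraction))
    print(min_fraction_to_pass(4.0))
```

Under this model a 4-fold change contributing only one fifth of
the signal would appear as a 1.6-fold change and fall below the
1.8-fold cut-off, which is consistent with the rationale for
analyzing cloned cDNA populations.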

The expression pattern of representative RSDD-derived
cDNAs in E11 versus E11-NMT and in a more expanded
E11/E11-NMT progression cell culture series is shown in
Figs. 29 and 30, respectively. Reverse Northern results
correlated well with Northern blots using E11 and E11-NMT
(~75% concordance) or a larger panel of cells
differentially displaying the progression phenotype,
including progression negative E11, CREF x E11-NMT F1 and
F2, E11 x E11-NMT A6, E11 x E11-NMT 3b, E11-NMT Aza B1
and Aza C1 cells, and progression positive E11-NMT, CREF
x E11-NMT R1 and R2, E11 x E11-NMT A6TD, E11 x E11-NMT
IIa, E11-Ras R12 and E11-HPV E6/E7 cells. Sequence
analysis of the various PEGen cDNAs identified both
unknown and known genes (Table 3). Five of 10 PEGen cDNAs
(50%) were classified as novel sequences since no matches
were found in current DNA databases. Novel PEGen cDNAs
include PEGen 13, 14, 24, 28 and 32. Known PEGen genes
included PEGen 7 (human papilloma virus-16 early region
1 binding protein; HPV16 E1BP), PEGen 8
(phosphofructokinase C; PFK-C), PEGen 21 (a
fibroblast growth factor-4 inducible gene; FIN-14), PEGen
26 (poly ADP-ribose polymerase) and PEGen 30 (rat esp1
homology). In the case of the PSGen cDNAs, six of six
(100%) were novel, including PSGen 12, 13, 26, 27, 28 and
29 (Table 3).

Table 3. PEGen and PSGen genes isolated using RSDD.

Nomenclature (a)   Identity (b)                        Homology (%) (c)

PEGen 7            Human HPV16 E1BP                    90
PEGen 8            Rat phosphofructokinase C (PFK-C)   100
PEGen 13           Unknown                             Novel
PEGen 14           Unknown                             Novel
PEGen 21           Murine FIN 14                       94
PEGen 24           Unknown                             Novel
PEGen 26           Rat poly ADP-ribose polymerase      100
PEGen 28           Unknown                             Novel
PEGen 30           Rat esp1                            98
PEGen 32           Novel                               Novel
PSGen 12           Unknown                             Novel
PSGen 13           Unknown                             Novel
PSGen 26           Unknown                             Novel
PSGen 27           Unknown                             Novel
PSGen 28           Unknown                             Novel
PSGen 29           Unknown                             Novel

(a) PEGen are progression elevated genes that display
elevated expression in E11-NMT versus E11 cells. PSGen
are progression suppressed genes that display elevated
expression in E11 versus E11-NMT cells.

(b) Sequences have been compared with reported genes in
various DNA databases (including GenBank and EMBL) and
identities with known genes are indicated. Genes
without homology to currently reported genes are
indicated as unknown.

(c) Percentage homology with known sequences, either human,
rat or mouse, is indicated.

Where no homology exists the cDNA is considered novel.

PEGen 7 is expressed at ~4-fold higher levels in E11-NMT
than in E11 cells. PEGen 7 is ~98% homologous to 16E1-BP,
a cDNA encoding a protein identified using the yeast
two-hybrid assay that interacts with human papillomavirus
type 16 E1 protein (35). 16E1-BP encodes a 432 aa protein
of unknown function that contains an ATPase signature
motif (Gly-X4-Gly consensus ATP binding motif at aa 179
through 186). 16E1-BP appears to be a form of TRIP13, a
protein previously shown to bind thyroid hormone receptor
in yeast two-hybrid assays. The role of PEGen 7/16E1-BP
in the progression phenotype in the E11/E11-NMT
progression model is not known. Additional studies are
necessary to determine if this gene change is associative
or causative of transformation progression.

PEGen 8 is expressed at ~3- to 4-fold higher levels in
E11-NMT than in E11 cells. PEGen 8 shows 100% homology to
rat phosphofructokinase C (PFK-C) (36). PFK catalyzes the
rate-limiting and committed step in glycolysis, the
conversion of fructose 6-phosphate to fructose
1,6-bisphosphate. Three subunit isozymes of PFK have been
identified that form homo- and heterotetramers with
differing catalytic and allosteric properties. PFK-M is
specific for cardiac and skeletal muscle, PFK-L is
expressed in many tissues but is most abundant in the
liver and PFK-C is expressed in several brain regions and
the anterior pituitary but not in liver, skeletal muscle,
or several other human tissues. The cDNA of PFK-C
isolated from a rat hypothalamic cDNA library is 2643 bp
and encodes a protein of 765 aa (36). In a recent study,
Sanchez-Martinez and Aragon (37) demonstrated that PFK-C
is the predominant form of PFK in ascites tumor cells
(obtained from a transplantable mouse carcinoma of
mammary origin), whereas PFK-L is most abundant in the
normal mammary gland. These results suggest the
interesting possibility that PFK-C might contribute to
the malignant nature of specific target cells. The role
of PEGen 8/PFK-C, reported here, in progression in the
E11/E11-NMT model remains to be determined.

PEGen 21 is expressed at ~3- to 4-fold higher levels in
E11-NMT than in E11 cells. PEGen 21 displays ~98%
homology with the fibroblast growth factor-4 inducible
gene FIN-14 (38). FIN-14 is a novel cDNA of unknown
function that hybridizes with a 4.5 kb mRNA that is
induced 4-fold in NIH 3T3 mouse cells following treatment
with FGF-4. The induction of FIN-14 occurs late (18 hr)
after treatment with FGF-4 and does not occur when cells
are treated for 18 hr with FGF-4 in the presence of
cycloheximide (38). These results confirm that FIN-14
encodes a late-inducible gene. Moreover, nuclear run-on
assays document that FIN-14 is transcriptionally
activated in NIH 3T3 cells following growth factor
stimulation. Tissue distribution studies indicate
expression of a single mRNA species in the kidney with
low levels of expression observed in several other
tissues including testis and thymus. Mouse embryogenesis
studies indicate that FIN-14 expression occurs
constitutively in mouse embryos between day 10.5 and
15.5. Unlike NIH 3T3, FIN-14 was constitutively expressed
in PC12 cells and its level did not vary appreciably in
response to growth factor stimulation. The role of PEGen
21/FIN-14 in progression in the E11/E11-NMT model system
is not currently known.

PEGen 26 is expressed at ~3- to 4-fold higher levels in
E11-NMT than in E11 cells. This cDNA is identical to rat
poly (ADP-ribose) polymerase (PARP) (39). PARP contributes
to the ability of eukaryotic cells to contend with both
environmental and endogenous genotoxic agents (40). PARP
is a nuclear enzyme that binds to DNA breaks and then
catalyzes the covalent modification of acceptor proteins
with poly (ADP-ribose) (39,40). PARP activity contributes
to the recovery of proliferating cells from DNA damage
and to the maintenance of genomic stability, which may be
regulated by effects on chromatin structure, DNA
base-excision repair and cell cycle regulation (39,40).
The role of PEGen 26/PARP in mediating the progression
phenotype is not currently known. However, since cancer
is a progressive disease characterized by the
accumulation of genetic alterations in the evolving tumor
(6), it is tempting to speculate that overexpression of
PEGen 26/PARP in E11-NMT may facilitate the ability of
these aggressive cancer cells to maintain genomic
stability during cancer progression. In this context,
PEGen 26/PARP may be an integral component of
progression. This hypothesis is readily testable.

PEGen 30 is expressed at 2- to 3-fold higher levels in
E11-NMT than in E11 cells. This cDNA displays ~98.5%
homology to rat esp1 (41). Rat esp1 encodes a 24-kDa
nuclear protein which is the rat homologue of Drosophila
Enhancer of split, a gene involved in ventral ectodermal
development in Drosophila (41). PEGen 30 appears to be a
homologue of esp1, since the message detected in E11 and
E11-NMT cells (~4 kb) is larger in size than the reported
esp1 transcript (1.3 kb) (41). The role of PEGen 30/esp1
in tumor progression in the E11/E11-NMT model system
remains to be determined.



The PSGen cDNAs, 12, 13, 26, 27, 28 and 29, consist of
sequences without homology to those in various DNA
databases. Expression of PSGen 12 and PSGen 13 cDNAs is
~3- to 4-fold higher in E11 versus E11-NMT cells (Fig. 29).
It is not currently known whether these genes simply
correlate with or functionally regulate the progression
phenotype. The identification of full-length cDNAs for
PSGen 12 and PSGen 13, as well as the other novel PSGen
and PEGen cDNAs, is in progress and, once these are
isolated, experiments can be conducted to directly define
the role of these progression-related genes in cancer
progression.

Presently demonstrated is a modified gene-identification
and gene-cloning technique, RSDD, that can efficiently
identify differentially expressed cDNAs. As predicted,
subtractive hybridization prior to differential RNA
display greatly reduces band complexity, a problem
encountered in standard DDRT-PCR in which RNA samples are
directly analyzed without subtraction. Unlike a previous
report using subtracted cDNAs processed through
successive rounds of PCR (25,42), common bands were
eliminated using reciprocally subtracted cDNA libraries
that had not been processed using PCR. In addition to
subtraction hybridization, the discovery of
differentially expressed genes was further streamlined by
using reverse Northern analyses with isolated cDNAs. With
3 single anchored oligo dT primers and 18 arbitrary 5'
primers, 72 bands were identified that displayed
differential expression using reverse Northern analysis.
Currently, 38 cDNA species have been analyzed by Northern
blotting and 31 (~82%) displayed differential expression
in E11 versus E11-NMT cells. Sequence analysis of the
cloned cDNA fragments revealed 16 different genes,
including 11 novel genes not reported in recent DNA
databases. RSDD represents a method of choice either as
a more efficient and less time consuming modification of
the differential RNA display strategy or as a screening
methodology for identifying differentially expressed
genes in reciprocally subtracted cDNA libraries.
Moreover, the ability of RSDD to identify differentially
expressed genes that are dissimilar to those recognized
using standard DDRT-PCR or subtraction hybridization
indicates that this approach will be a valuable adjunct
in cloning the complete repertoire of differentially
expressed gene changes occurring between complex genomes.
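The yields reported for the successive screening steps can be
checked with a short calculation. The counts below are the ones
stated in the text; the stage labels, variable names and helper
functions are illustrative assumptions. Note that the 38 cDNAs
analyzed by Northern blotting are the subset examined so far
rather than a further selection step.

```python
# Counts reported in the text for the E11/E11-NMT screen.
FUNNEL = [
    ("differentially displayed bands isolated", 234),
    ("bands giving a reverse Northern signal", 181),
    ("bands passing the >1.8-fold cut-off", 72),
]
HIGHER_IN_E11 = 7        # candidate PSGen cDNAs
HIGHER_IN_E11_NMT = 65   # candidate PEGen cDNAs
NORTHERN_TESTED = 38     # cDNAs analyzed by Northern blotting so far
NORTHERN_CONFIRMED = 31  # of those, differentially expressed


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def stage_yields(funnel):
    """Yield of each stage relative to the stage immediately before it."""
    rows = []
    for (_, previous), (label, count) in zip(funnel, funnel[1:]):
        rows.append((label, count, percent(count, previous)))
    return rows


if __name__ == "__main__":
    for label, count, pct in stage_yields(FUNNEL):
        print(f"{label}: {count} ({pct:.0f}% of previous stage)")
    print("cut-off total:", HIGHER_IN_E11 + HIGHER_IN_E11_NMT)
    print(f"Northern confirmation: "
          f"{percent(NORTHERN_CONFIRMED, NORTHERN_TESTED):.0f}%")
```

The two direction counts sum to the 72 bands that passed the
cut-off, and the rounded percentages agree with the 77% and ~82%
figures given in the text.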
What is claimed is:

1. A method for identifying differentially expressed
nucleic acids between two samples, comprising:

a. selecting a first and second nucleic acid
sample, wherein the nucleic acid samples
contain a repertoire of nucleic acids;

b. performing reciprocal subtraction between
the nucleic acid samples to produce two
subtracted nucleic acid samples;

c. amplifying the two subtracted nucleic
acid samples; and

d. comparing the two subtracted nucleic acid
samples to identify differentially
expressed nucleic acids.

2. A method for identifying differentially expressed
nucleic acids between two samples, comprising:

a. selecting a first and second nucleic acid
sample, wherein the nucleic acid samples
contain a repertoire of nucleic acids;

b. amplifying the two nucleic acid samples;

c. performing reciprocal subtraction between
the amplified nucleic acid samples to
produce two subtracted nucleic acid
samples; and

d. comparing the two subtracted nucleic acid
samples to identify differentially
expressed nucleic acids.

3. The method of claim 2, wherein the two subtracted
nucleic acid samples from step c are amplified prior
to the comparing of step d.

4. The method of claim 1 or 2, wherein each of the
nucleic acid samples comprises a library of nucleic
acids.

5. The method of claim 1 or 2, wherein the nucleic acid
samples are mRNA or cDNA derived from mRNA.

6. The method of claim 1 or 2, wherein the nucleic acid
samples are obtained from total RNA from E11 and
E11-NMT cells.

7. The method of claim 1 or 2, wherein the first and
second nucleic acid samples are obtained from cells
that differ in their exposure to external factors or
in their gene expression.

8. The method of claim 1 or 2, wherein the first and
second nucleic acid samples are obtained from cells
in different developmental stages.

9. The method of claim 1 or 2, wherein the amplifying
of step (d) comprises PCR amplification.

10. The method of claim 9, wherein the PCR amplification
uses a set of random primers.

11. The method of claim 9, wherein the 3' primer used in
the PCR amplification is a single anchor oligo dT 3'
primer.

12. The method of claim 9, wherein the 5' primer is an
arbitrary primer.

13. The method of claim 1 or 2, wherein the comparing of
step (e) comprises using a gel to separate the
nucleic acids from both of the libraries.

14. The method of claim 1 or 2, further comprising PCR
amplifying the first and second nucleic acid
samples.

15. The method of claim 1 or 2, further comprising
reamplifying differentially expressed nucleic acids.

16. The method of claim 1 or 2, wherein the comparing of
step (e) comprises comparing the quantities of the
two amplified differentially expressed nucleic
acids.

17. The method of claim 1 or 2, wherein differences in
the quantities of nucleic acid between the two
subtracted libraries are electronically quantified.

18. The method of claim 1 or 2, wherein the libraries of
step (b) are constructed with λ-ZAP cDNA library
kits.

19. The isolated nucleic acid identified by the method
of claim 1 or 2, wherein the nucleic acid was not
previously known.

20. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PSGen 12.

21. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PSGen 13.

22. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PSGen 23.

23. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PSGen 24.

24. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PSGen 25.

25. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PSGen 26.

26. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PSGen 27.

27. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PSGen 28.

28. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PSGen 29.

29. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PEGen 13.

30. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PEGen 14.

31. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PEGen 15.

32. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PEGen 24.

33. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PEGen 28.

34. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PEGen 32.

35. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PEGen 42.

36. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PEGen 43.

37. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PEGen 44.

38. The isolated nucleic acid of claim 19, wherein the
isolated nucleic acid is the nucleic acid designated
PEGen 48.

39. The isolated nucleic acid molecule of claim 19 which
comprises:

(a) one of the nucleic acid sequences as set forth
in Figure 35;

(b) a sequence being degenerated to a sequence of
(a) as a result of the genetic code;

(c) a sequence encoding one of the amino acid
sequences as set forth in Figure 35;

(d) a sequence of at least 12 nucleotides capable
of specifically hybridizing to the sequence of
(a), (b) or (c).

40. A purified polypeptide comprising one of the amino
acid sequences as set forth in Figure 35.

FIG. 5

PEGen 7-90% homology to human HPV16 E1BP

TAAANCGGTG GTACTGCTGC ACGGTCCTCC GGGTACTGGA AAGACATCCC 
TTTGTAAGGO ATTAGCCCAG AAACTGACCA TCAGACTGTC . . AANCAGGTAC 
CGGTATGGCC AGTTAATTGA AATAAACAGC CACAGCCTAT TTTCTAAGTG 
GTNTTCAGAA AGTGGCAAGT TGGTAACTAA GATGTTCCAG AAGATTCANG 
ACTTGATTGA TGATAANNAA NCTTTGGTGT TTGTCCTGAT TGATGANGTA 
AGCACTCANN GGTACTCATT CTTNGTCTGC ATTGCCTCTT GCTATTACTG 
CCTGATCCCT CTCATTTGGT TCACTGTGTC GCNANCTCTT TTCTATGGAT 
CTTTTCCNAN CCACCCGTTT C 

FIG. 6 

PEGen 8-Rat phosphofructokinase C

CTGACGTAGG GTCTGTTGCG TCAATGGTTA TAGCAAGTGA TGCTCTCTGA 
TTATTACTGC TGACAATACT CGGCCAACAA TTCTTGCATA GAGTGCTGAT 
AAATAACTAT GTTACAAAAA GGGGTGGTCC CTGGAGAACA TTACAGGCTT 
CCCTAGGTAA GTGTGCAGGT CAGGAGACGG CATATTCAAT CAGATGGCTG 
ATAGTTCTCC GTGGTTATGC ACCGGCTCCA GCTTGCCTAC - GTCAC 

FIG. 7 

PEGen 13-Novel

rzc AGC ATGAT GAATTTAATG CAACAGTCAT AGCAGGGCAA GGGGAGAGAA 

aScagItgg actatctgca tcatgaagcg agggcttgtg TCGGCGGCTA 

TGTGCAGAGA CGAGCAGGGC GAGGCACTTA AAAGCTGCTN GATGAAAATC 
CACCCAGGAG AANTCTGGGC CTACGTCA 

TGACGTAGGC CCAGACTTCT CCTGGGTGGA TTTTCATCCA GCAGCTTTTA 

Ictgcctcgc CCTGCTCGTC TCTGCACATA gccgccgaca caagccctcg 
cSgSStg cagatagtcc atctgccttt ctctcccctt gccctgctat 
gactgttgca ttaaattcat catgctgcca aaaaaaaaaa a 

FIG. 8 

PEGen 14-Novel 

r^CATAAATA CACTTTATTT CATTCGAAAT GCATAATCAC ACTGGGAGCA 
CTCCCTTTGG AGCACTCCTC TAGCAGCAGG TCCGAAGTGC TCCAGCATCG 
TCAGCTGGCT CCAACACCTA CGTC 

FIG. 9 

PEGen 15-Novel 

TTTTTTTTTT TTTGGAAACA GAATAAAGTG CTTTATTCTC TGGCTGGCTC 
TCCTACGTCA C 

FIG. 10 

PEGen 21-94% homology to mouse FIN 14 

TCGGCGATAG CATTGGAGCA AGTCTTATCA GCAAGCAATG TTTTCAGTTA 
TGTTTCAAAG TTAAGAATGG GTTTAAACTT GCTGAACGTA AAGATTGACC 
CTCAAGTCAC TGTAGCTTTA GTACTTGCTT ATTGTATTAG TTTANATGCT 
AGCACCGCAT GTGCTCTGCA TATTCTGGTT TTATTAAAAT AAAAAGTTGA 
ACTGCAAAAA AAAAAA 

FIG. 11 

PEGen 24-Novel 

TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TNGCCAGGCT 
ATGTCTCAGA CTTTATTATT ATTATTATTA TTATTATTAT TATAAATAAA 
ACATGTNCTT TCAATTAGGT TACAANAGTA TTTATCTCCA TAACGCTTCT 
TCATACATCC TTAGTTTTGG ATTAAAGTAC CATCCACCCC AACTCAAACT 
GTAACCCCCA GTAATCCCCT CTAACGTGGA AATTTCTGGT TTAACAACTC 
AGTTAACTGC CCCACAAACA GTGGGAGGCC GCTCTTGCAT GGCTATGCCA 
CGTAACCCTT CACTGCTTCA CTTCTTCGCT GGCT 

FIG. 12 

PEGen 26-Rat poly ADP-ribose polymerase. 

GACCGCTTGT ACCATCCAAC TTGCTTTGTC TTCTGCAGAG AGGAGGCTAA 
AGCCCTTGAG CTGGCTGGCA CTGTACTCAG GCCGGAAGCC CAGCTCGTCC 
CGGTTCTTGA CAAAGCAAGT TGGATGGTAC AAGCGG 

FIG. 13 

PEGen 28-Novel 

TGCCGAGCTG GGTATTGTGA CGGTTGATAA TGGCGGCATC ATGTTGCCAG 
GTACCGGGTA AGCAGACCTC AGAGCACAGC TTATTGTCCA GTGCTTTCAC 
GCTCGCGACG TCAAAGTCAT TGTTATTGTC ACACTCCATG CCTAGAAATG 
CGCATGTCGT CTGGCCATCT TCTTGCACAG GGGATCTGTC CTCTTCCTCC 
ATGATATCAT TTCCCTCTGC ATCCTGCTCT CCAGCTGGAA GGCCAGCAAA 
ATTGCTGTCT GGGGACTCTG CTGGGGTCTC CTCCTCTTCT GAAGGGGCCC 
TGCTAGCAGC TCGGCA 

FIG. 14 

PEGen 42-Novel 

AGGGGTCTTG ATGGACTTGG GTCGGACATC TTAGTGACCT GTGAATTCTT 
CTGTGGAGGC TGAGTCTCAC GTAGCCGAGT TTAATATCTG TGCTATTTAC 
TAAAGTATCT GCCACCAAAT TGTACCAACT CATAGTTTTA TATGAATGTT 
GATGAGTCTG TATCATAAAT AGAATTGTTG ATACATCCTT AATTTGTGCA 
ATATTGTATG AAGAAGATTG TTATCAATTA AAACCACGCC TCTTTATGAT 
CCTNNNAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
AACCNCCTCA AATCCATNGG TTCTAACCCA AAACCCT 

FIG. 15 

PEGen 43-Novel 

TTTTTTTTTT CATACACCAT CAAACCAATT TTATTTCTAT AGCAACGTTT 
CTCACGTCTG AACCTGAGAA TAAGTCACCA GCTCTTGACA GTAAACATGG 
GCCCTATCAA ATTATATTAG ACTCCTCAGT GTCCCGCCAT GTGGCCTTGC 
ACCAAATCAA TTAGTTTGAG GGCCAAAATC CTGTTGGGTT TCAAATAAAG 
TGTCAGGTCA TAAGGAGGGG GAGGGACTCA ATTCATGGGA ACATTTTTAC 
CTGTTCAAAT AGATAAACTG AATTGCCCTA TCTGTGGTCA CCTGGATCCA 
AGACCCT 

FIG. 16 

PEGen 44-Novel 

CCCTGACGAT AAATGGTAAG GAACTTTTTT TTTTTTTTTT TTTTTTTTTT 
TTTTTTTTNC GAAATAAACA AACACAGCTT ATTATTTGGG GGAACATTAA 
NTTCTATAAN TGAACACAAA ANAAAATTAA NANTTAATGG GGGGGTANAA 
GGGACTTTGA ATCTATCTGG TATCATGACA TTGAAGCANA NACCTGANTG 
ACCAGAAAGA GAGAGAGAGA GAGAGAGAGA GAGAGAGAGA GAGAGGTTTC 
ATATGAGCTA GTGTTACAGG CTTTATTAGT CTATTAGTCA GGGACC 

FIG. 17 

PEGen 48-Novel 

AATCGGGCTG GATGGGTGTA TCCGGCACTG TTTCGTAGCG GCAGCAACTG 
GGTGCTTCTA TCTGAAAGCG GGCTTCACAA AAACTACTGC GCCACCCGAC 
TCGCTGCGGC ATCGCCCGGT GGCGAGTACC GTATCGCCTT TCCTGGTGCA 
GAAGAAGTGT TTACAGGAGG CGGTCATTTA CCGCAATCTG ATTCTGTTTT 
TTATTCTCCC TGGCGGGTGA TCGCGATCGG CAGTTTGAAA ACGATCGTTG 
AATCCACGCT CGGGAATGAT GTGGCTTCGC CGCCAACGCT TACTGACATT 
TCATTTGTAC AGCCCGATT 

FIG. 18 

PSGen 1-80% homology to B. taurus supervillin 

GCCGAGCTGT GTAAAACCAT CTATCCTCTG GCAGATCTAC TTGCCAGGCC 
actcccaggg GGGGTAGACC CTCTAAAGCT TGAGATTTAT CTTACAGATG 
AAGACTTCGA GTTTGCACTC GACATGACCA GAGATGAATT CAACGCACTG 
CCCACCTGGA AGCAAATGAA CCTGAAGAAA GCGAAAGGCC TGTTCTGAGG 
GTGAGATGAC AGCCACAGAG AGGTCACTGC CACTAGACCA GAAAGTGGAT 
GGAGATATAT ATTTGGACTG GTGTTTTTTT CTGTCAG 

FIG. 19 

PSGen 2-91% homology to human HTLV-1 Tax interacting protein 

ATCGGGCTGC AGATTGGAGA CAAGATCATG CAGGTGAACG GCTGGGACAT 
GACCATGGTC ACTCATGACC AGGCTCGGAA GCGGCTCACC AAACGTTCGG 
AGGAAGTGGT CCGCCTGCTG GTGACTCGGC AGTCTCTGCA GAAGGCCGTA 
CAGCAGTCCA TGCTGTCATA GCTGTAGTCA GCCTAGACTT CTGCCCACTG 
ACCTTTTNGG GCACTGAGAA CACATCCACG CTCTGTCTGT ATCTAGTTCT 
GGCTTCTGCT GTGTGCTANG CCCCAGCTCT GAGGAGTAAC AGCTGATCCC 
AAAGGTCCAA GCCAACCTTC TTACCCCTCA GCCCCCANCC CGAT 

FIG. 20 

PSGen 4-Rat proteasome activator 

TTTTTTTTTT tttgggcaac tatgtattta ttgtgtttgg aaggcagagt 
GAGGGAGGAG ACCCCAGCAG GAAGAAGACT GGGTGCAGTC TAGAGTTCCT 
AGTCAAGAGT aggaaggttt ctgttatacc CATCATAGAA CGAGAGAGGG 
GGCTCAATAG ATCATCCCCT TTGTCTCTCC ACGGGGCTTC TTGAGCTTCT 
CAAAGTTCTT CAGGATGATG TCATATAACA CAGCATAAGC GTTACGGATC 
TCCATGACCA TCAGCCGGAT CTCCTGGTAT TCCGCCTCGT CCAGCTCGGC 

FIG. 21 

PSGen 10-Rat Ferritin Heavy Chain 

AANATCTGCT TAAAAGTTCT TTAATTTGTA CCATTTCTTC AAATAAAGAA 
TTTTGGTACA AATTAAAGAA CTTTTAAGCA GATGTTTTGG TGCAACTAAT 
AGAAAAGATA AAGGCAGCCT GACATGCATG CACTGCCTCA GTGACCAGTA 
AAGTCACATG NCCTTGGGAC GTCAGCTTAG NTTTATCACN GTGTCCCAGG 
GGTGCTTGTC AAAGAGATAT TCTGCCATGC CAGATTCAGG GGCTCCCATC 
TTGCGTAAGT TGGTCACGTG GTCACCCAGT TCTTTAATGG ATTTCACCTG 
CTCATTCAGG TAATGCGTCT CAATGAAGTC ACATAAGTGG GGATCATTCT 
TGTCAGTAGC CAGTTTGTGA AGTTCCAGTA GTGACTGATT CACACTCTTT 
TCCAAGTGCA GTGCACACTC CATTGCATTC AGCCCGCTCT CCCAGTCATC 
ACGGTCACNT A 

FIG. 22 

PSGen 12-Novel 

TGACGTAGGG CCGAGAGCAA CAAGCACAGA ACTCCTTCTC CAGTTTCACC 
CTGATGAAGT TGAGGCACTC TTCTGCACTG GGAGGGGCCA GCCTGGGGGC 
CAGGCACATT GGACACCACC TTCCCATGGA CTACAGCGTC AATGCCATTG 
CCTTCTATTC CTATACCTTC TAGGGGCTGC CCCTCTTCCC ATTCAGCCAA 
CACTGAGTGT TGGGAGATTT CTCTTTTTTA AAAACACATG AGAAAATAAA 
TGCACTTTAC TCCCTCCCCA AAAAAAAAAA 

FIG. 23 

PSGen 13-Novel 

GTAGGCAATA AAATGTTTTC AGAGGTGCGA AAAAGCTTTT GTTTTCTTAA 
ACCATTCTTA GTCTCTGCCA CACTTGACAC TCCGTCAAAG TGAGAAGCGA 
ACTAAAGACC AACTGCGGTG GAAAATATTA TGTTTATGTA ATAAAAAAAA 
ATCATGTAAC TGCAAAAAAA AAAAAAA 

FIG. 24 

PSGen 23-Novel 

TGCCGAGCTG AAAACATACA TCCGCACCGG GTTGAGATAG CTGGCCCTCC 
GTCCCCGGGC ATACTCTTTG GATAAGAACC CCGGCCTTGT TACCAGGTAC 
CGGAGTGAGC TGAAAAATTT ACCGTCGAAA TGGGTGATGT CCTGGAAAAA 
ATGGTTCACC AGCTGCCAGG CAGATTCTTT GGGTTCCACA TTTTCCTGCC 
CACAGATGTG GCAGAAGCGG TCAAGTAATG CAGCATTACA ATTGAGGCAG 
ATCTTTTCTT TTCTTTCCTT GGAGTGGCTC AACCAGCGAT TTTGGTTAAA 
AATAATCAAA AAAGCGACGG CAAAACTTTT GTTATATTCC CGCCTGTGGC 
ATTTGAACTG TGCCCGGCAA CCGAATAACT TTTAATTTTG AAAATAAAAT 
GCATACTAGA TTTTTAGCGG TTGCCTCCTG GCCATTGCTT CAGGCGCCNG 
CACAGCGTCA GCCCAGTTTT ACCACNANGA ATATCCTAAG CGTTGAAACA 
GGGCACAGCC GAAAAAAACN CTGGCNACAA AAAANATCCG GACATCCTTT 
TTCCAATTTT GAAACCGAAN GCNCGCAAAC NAAGGTTCTT CGGGAAAAAA 
AATCGCCAAA ATACNCGANA TCAAACTNTC CAA 

FIG. 25 

PSGen 24-Novel 

TGCCGAGCTG GGGGGAGTTC CAGGAATTTG TGGACTATTT CCAGGAGGAA 
TTGAGGAATC TAGAAGTAAT AAGAACTTCA CAAGTAGAAC AACAGAGTTA 
ATTGACCTCT ATCCTTAAGA GTTACCAGAG AATTATTAAA AAACTAAAGA 
ACAATCAAAG CCTGGTCCTG TGCCACCACC CAAAAACATG TATAGCCTAT 
GTGCAGCTCG GCA 

FIG. 26 

PSGen 25-Novel 

CTCANAGGGC NNNTTNGNGG NCNTCATGCN CCAGGNTCCN NCCCCCANAN 
GANCNNCCNG GTAAACTACA CNGGAGTACT TAAGTGGACA NNCCACATGC 
GANGGNCAAG GGGATCACCN TCNCTCCTNC AGNCTNTNCG TGNCTCTCCT 
GTNCNTNCAC TGCCNCANAA NGGANGCNCN NNCTCCTATC TGTNTACAGN 
AAACNTNGCN CTNNCTCTAA GCTCNCCCAC TNTGTGGAAA GGCNATGTGT 
GCGTGCCTCT CCCCTATCAC GGCNGTTTGC NAAANGGGGA TGTNCTGCNC 
GGCGATGAAG TTNGGTCACT CCATGTTTCC CAGTCCNACC TGTTAGACNA 
AGNATTGNAN TGTGATACGA CTCNCTGTAA GGGGANTNGC GGACCCAGTA 
TGTTTGGCCC NACNNCCACT TCTTTAAATG GTGGCTAACG GCGCTTCCTA 
GNATAAACAC TATTGGTCCC CCCCTCTGCA GNACCCNTTA CTTCCGNANA 
AAAATTGTTG TCNTGATCCG CGACAACCAC ACCGTCTGTN GNTTTTAGTT 
GCAACNCNNA TCNCTCCAAA AAAGTTTCAG AAATCTTCAT TTTCCCNGGT 
TGAGCCCNTG ACAAACCCCT NAGGATTTGT CGAATGTAAA GTCTCCNGAT 
CTTCAATAAA NNTCCAAAAG NCTANCGAT 

FIG. 27 

PSGen 26-Novel 

TCACTGGGCN NNNTGGTNGN CGTCATGCNN NAGGTTCCNN CCCCCNNANG 
AACCTCCNGG TAATCTACAC NGGAGTCTTA AGTNGACAAN CCCACACTGC 
GANGGTCAAG NGGATCACCA TCNCCNCCTC CCAAGCTTNT NCATTGATGC 
TCTCTCTGTT CCGTNCCCTG CCGCTACACA TGGANGCTCT TNCTCCTTNT 
CTCNTCTTAC NANNCAAACA TTGCCCTNTC TCATA 



FIG. 28 

PSGen 27-Novel 

GGGAANGGGA NNAAAAAGGA ATTTTTTNGG GGGGGGNTTN TCTGGGAAAN 
TTTTTTTTTT TTTTTGGNAA AAANGGGGGG GGAAANAANC CGNTTTTCCC 
NAAAACNGGG GGGAACNGGC CGGGGGGGGA AAAAAAAGGG TTACNAAGGG 
AAACCTTTNA AANNGGAANG GNTTTGCNNC CCTNTNGAAA NNTTTGCCCC 
CCNNNAGGAA TCCCNGGNNA AACCCAANNC CNNCNCNCNG GGGGNCNNTN 
CNANGGGACC CCAACNCGGG CCCNAACTNG GGGNAAANAN GGGCAAAACN 
GGTNCCCGGG GNAAAANGGT ANCCCCTC 



WO 99/43844 



PCT/US99/04323 






FIG. 29 

PSGen 28-Novel 

TGCCGAGCTG GGGGTGAAGC ACCGGAAAAC AACCGATCCA TCTCTTATCA 
CAGGGTCTCC AAGATCCCAA ACCCAAAAGC CACATTGTTA ATTAGCCTTT 
TTATTGTGTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT 
TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTGGCAGC TCGGCA 



FIG. 30 

PSGen 29-Novel 



T?r^CGCT GATTTTTACG AACAT TACCT GGCAGGGAAA TTTGATAAGT 
ATCCACTGTG gStGGCGCAC TACCTGGTAA AAGACAAACC CCGTGTGAAA 
Ar^CTGGl CTTTTTGGCA ACACAACGAA ACCGGCCACG TGAATGGCAT 
rCGGTCTTAT GTGGACTTCA ATGTTTTCAA CGGGGACAGC ACAGATTTTG 

ccgaIctatt aatIaaataa tgcagaattt cgcttttcaa ataagcccat 

GGATCCTGAC GT AAAAT AT T TCCTGCTGGT ^CGTGCAG ^CATTTCGA 

TGCTCATACT ttggctgatg ctcaacatga cctttgggat ctattttaat 

TTTGCTTTCC CCGACAATGG TTTGACGCTT GGCAACATCA TTTATTACC. 
CTTCCTGCTG GGCAGCTCGG CA 



FIG. 31 

PEGen 32-Novel 



^^CVCTGMGTGGG GACGAAGCCC GAGTCCGTCC TGACATGTTT 
rCAGTGGAAA AGATTTTGTT NTGAGCGTTN CTTTCTNNTT ™TTTTNNHT 
TGNTTGTTNN ATGTTTTTGT TGTTGTTTTN TTNAAACTGT NTGTTGNCAN 
TTCAA™ A^GGNAGGNA ANTNTGTGNC TNCNTTGCAN ™CATGN 
TNCCCANANC CCAAAAAAAA AAAAAAAAAA AAAAAGAGTA CAAATATCAC 
IaaXtttgac ATTTTTGTAA TAATACTTTG GTTGTTGTTT GGTGACGGCG 



ATTG 

FIG. 34A 

FIG. 35A 

PSGen 12 cDNA Sequence 

GCGGTGGTGA CGGTAGTATG GCCGCACTTT ATGGTGGCGT GGAAGGGGGA 
GGCACACGGT CCAAAGTCCT TTTACTTTCT GAGGATGGGC AGATCCTGGC 
AGAAGCAGAT GGACTGAGCA CAAATCACTG GCTGATTGGC ACAGGTACCT 
GTGTGGAGAG GATCAATGAG ATGGTGGACA GGGCTAAACG GAAGGCTGGA 
GTGGATCCTC TGGTACCCCT TCGAAGCCTG GGCTTGTCCC TGAGTGGTGG 
GGAGCAGGAG GATGCAGTGA GGCTCCTGAT GGAGGAGTTG AGGGACCGAT 
TTCCCTACCT GAGTGAAAGT TACTTCATCA CCACTGATGC AGCAGGTTCC 
ATCGCCACAG CTACACCGGA TGGTGGGATT GTGCTCATCT CTGGAACAGG 
CTCCAACTGT AGGCTTATCA ACCCTGATGG CTCTGAGAGT GGCTGTGGTG 
GCTGGGGCCA CATGATGGGA GACGAGGGAT CAGCCTACTG GATTGCACAC 
CAAGCTGTGA AAATTGTGTT TGACTCCATT GACAACCTGG AAGCAGCTCC 
TCATGATATT GGCCATGTCA AGCAGGCCAT GTTCAACTAC TTCCAGGTGC 
CAGATCGGCT AGGAATCCTC ACTCACTTGT ATAGGGACTT TGATAAGTCC 
AAGTTTGCTG GATTTTGTCA GAAAATTGCA GAAGGTGCAC AGCAGGGAGA 
CCCTCTTTCC AGGTTCATCT TCAGAAAGGC TGGGGAGATG CTGGGCAGAC 
ACGTTGTGGC AGTATTGCCA GAGATTGACC CAGTTTTGTT CCAAGGGGAG 
CTTGGCCTCC CCATTCTGTG TGTGGGCTCA GTGTGGAAGA GCTGGGAGCT 
ACTGAAGGAA GGCTTTCTCC TGGCACTGAC GCAGGGCCGA GAGCAACAGG 
CACAGAACTC CTTCTCCAGT TTCACCCTGA TGAAGTTGAG GCACTCTTCT 
GCACTGGGAG GGGCCAGCCT GGGGGCCAGG CACATTGGAC ACCACCTTCC 
CATGGACTAC AGCGTCAATG CCATTGCCTT CTATTCCTAT ACCTTCTAGG 
GGCTGCCCCT CTTCCCATTC AGCCAACACT GAGTGTTGGG AGATTTCTCT 
TTTTTAAAAA CACATGAGAA AATAAATGCA CTTTACTCCC TCCCCAAAAA 
AAAAAAAAAA AAAAAAAAAA AAAA 

PSGen 12 Protein Sequence 

GGDGSMAALY GGVEGGGTRS KVLLLSEDGQ ILAEADGLST NHWLIGTGTC 
VERINEMVDR AKRKAGVDPL VPLRSLGLSL SGGEQEDAVR LLMEELRDRF 
PYLSESYFIT TDAAGSIATA TPDGGIVLIS GTGSNCRLIN PDGSESGCGG 
WGHMMGDEGS AYWIAHQAVK IVFDSIDNLE AAPHDIGHVK QAMFNYFQVP 
DRLGILTHLY RDFDKSKFAG FCQKIAEGAQ QGDPLSRFIF RKAGEMLGRH 
VVAVLPEIDP VLFQGELGLP ILCVGSVWKS WELLKEGFLL ALTQGREQQA 
QNSFSSFTLM KLRHSSALGG ASLGARHIGH HLPMDYSVNA IAFYSYTF- 

FIG. 35B 



PSGen 13 cDNA Sequence 

GGCACGAGCT CTCCTCGTCC CCTCCCTTCT CCACTGCAGC CTTTCTCTTA 
GCCCGAACCA CTTCCTTCTT CTGCTTGTTC CTCCCTAGGG CGCGGAAGCT 
GAGTGCAGGG TTCAGACCCA CGCGGCGAGC AGCTCTTCAG TGAAGAAGGA 
AGCAATCGGA GGGTCAGCAA TGAACGTGGA GCATGAGGTT AACCTCCTGG 
TGGAGGAAAT TCATCGTCTG GGTTCCAAAA ATGCCGATGG GAAACTGAGT 
GTGAAGTTTG GGGTCCTCTT CCAAGACGAC AGATGTGCCA ATCTCTTTGA 
AACCGTTGGT GGGAACTCTG AAAGCCCGCA AAACGAAGGA AGATTGTTAC 
GTACGCAGAA GAGCTGCTTT TGCAAGGTGT TCATGATGAT GTTGACATTG 
TATTGCTGCA AGATTAATGT GGTTTGCAGA TCTGGGGGTA TCTGGTAAAC 
TGGAATAATT AAGTTAAAGG ACAAACATGA AGTTCCTTAT GTATTTTTAT 
AGACCTTTGT AAACAAAAGG GGACTTGTTG AGAAGTCCTG TTTTTATACC 
TTGGAGCAAA ACATTACAAT GTAAAAATAA ACAAAACCTG TTATTTTTTT 
TTTCTTAAGA AGGTAATCGG GAGACGTAGG CAATAAAATG TTTTCAGAGG 
TGCGAAAAAG CTTTTGTTTT CTTAAACCAT TCTTAGTCTC TGCCACACTT 
GACACTCCGT CAAAGTGAGA AGCGAACTAA AGACCAACTG CGGTGGAAAA 
TATTATGTTT ATGTAATAAA AAAAAATCAT GTAAAAAAAA AAAAAAAAAA 



PSGen 13 Protein Sequence 



MNVEHEVNLL VEEIHRLGSK NADGKLSVKF GVLFQDDRCA NLFETVGGNS 
ESPQNEGRLL RTQKSCFCKV FMMMLTLYCC KINVVCRSGG IW- 

FIG. 35C 

PEGen 28 cDNA Sequence 

GTGTGGTGTG TCTCTCAGAC GTCCGTGACA CTTTGATCCT GCCCTGCCGG 
CACCTGTGCC TCTGCAACAC CTGTGCAGAC ACCCTGCGCT ACCAGGCCAA 
CAACTGCCCC ATCTGCCGGC TGCCCTTCCG GGCACTGCTT CAGATCCGAG 
CCATGAGGAA AAAATTGGGC CCTCTGTCTC CAAGCAGCTT TAACCCCATC 
ATCTCTTCCC AGACTTCGGA CTCTGAGGAA CATTCATCCT CAGAGAACAT 
CCCTGCGGGC TATGAAGTGG TGTCTCTCCT GGAGGCCCTC AATGGGCCCC 
TCACCTCATC CCCAGCGGTG CCTCCCCTTC ACGTTCTTGG AGATGGCCAC 
CTCTCAGGAA TGCTGCCGTC CTATGGCAGT GATGGCCACC TGCCCCCTGT 
TAGGACACTG TCCCCCCTTG ACCACCTGTC TGATTGCAAC AGCCAAGGGC 
TCAAACTCAA CAAGTCTCTC TCCAAGTCCA TTTCCCAGAA TTCTTCTGTG 
CTTCACGAAG AGGAAGATGA GCGCTCTTGC AGTGAGTCAG ACACTCAGCT 
CTCTCAGAGG CTGTCAGCCC AGCATCCTGA AGAGGGACCT GATGTGACTC 
CAGAGAGTGA GAACCTCACG CTGTCCTCCT CAGGGGCTGT TGACCAGTCA 
TNTTGCACAG GGACTCCGCT CTCTTCCACC ATCTCCTCCC CAGAAGACCC 
AGCCAGCAGC AGCCTGGCCC AGTCAGTCAT GTCCATGGCC TCCTCCCAGA 
TCAGCACTGA CACCGTGTCC TCCATGTCTG GCTCCTACAT TGCACCTGGC 
ACAGAAGAAG AAGGAGAGGC CCCACCTTCC CCCCGAGCTG CTAGCAGGGC 
CCCTTCAGAA GAGGAGGAGA CCCCAGCAGA GTCCCCAGAC AGCAATTTTG 
CTGGCCTTCC AGCTGGAGAG CAGGATGCAG AGGGAAATGA TATCATGGAG 
GAAGAGGACA GATCCCCTGT GCAAGAAGAT GGCCAGAGGA CATGCGCATT 
TCTAGGCATG GAGTGTGACA ATAACAATGA CTTTGACGTC GCGAGCGTGA 
AAGCACTGGA CAATAAGCTG TGCTCTGAGG TCTGCTTACC CGGTACCTGG 
CAACATGATG CCGCCATTAT CAACCGTCAC AATACCCAGC GCCGGCGACT 
ATCACCCAGC AGCCTGGAGG ACCCTGAGGA GGACAGGCCT TGCGTATGGG 
ATCCTTTGGC TGTCTGAGGG CACTGGCACC TGTACCTGGG CTTCCCCTCC 
TGTCCGCCTT CCATCTGTCC TCACTGGACC ACAGGCCTTC TGGGCATCTT 
CAACAAGACA CGTGGACTTT CTACTCTCAT GAAGGGAGGA CAGTGCAACC 
CTCCACCAAC TTCATCTCCT GTAACCATGA TTCTTACCCT CTCAGAAAGT 
ACCAGAAGCC TTCCTCCTGT GGGCTGATGT GTGCCAGCCA AACCCAGTGG 
GTCAGCTGAG CTGAGGGTCA GGGCTGGTTG TTTCTGTAGC CTTTTCTCTT 
CCAAATGGAG ACCAACGAGA AANAAAAAAA AAAAAAAA 

PEGen 28 Protein Sequence 

VVCLSDVRDT LILPCRHLCL CNTCADTLRY QANNCPICRL PFRALLQIRA 
MRKKLGPLSP SSFNPIISSQ TSDSEEHSSS ENIPAGYEVV SLLEALNGPL 
TSSPAVPPLH VLGDGHLSGM LPSYGSDGHL PPVRTLSPLD HLSDCNSQGL 
KLNKSLSKSI SQNSSVLHEE EDERSCSESD TQLSQRLSAQ HPEEGPDVTP 
ESENLTLSSS GAVDQSXCTG TPLSSTISSP EDPASSSLAQ SVMSMASSQI 
STDTVSSMSG SYIAPGTEEE GEAPPSPRAA SRAPSEEEET PAESPDSNFA 
GLPAGEQDAE GNDIMEEEDR SPVQEDGQRT CAFLGMECDN NNDFDVASVK 
ALDNKLCSEV CLPGTWQHDA AIINRHNTQR RRLSPSSLED PEEDRPCVWD 
PLAV- 

FIG. 35D 

PEGen 32 cDNA Sequence 

GGCACGAGGC GCCGCCTTCC TGCTCGCGCC CTATCGCCGC CTTCCTGCTC 
GCGCCCTATC GCCGCCTCCG AGTCTTCCTG CGCCCCGGGC TTCCGCCGCT 
TCATTGATTT CCGTTTCTCG CCGCTGCAGC CTCCTGACAC GGTGATCCGG 
GCGGGCCCCG CAGGAATTTT ATCCCCTCAC CGGCCTCACA CTAGTGTCGC 
ATGTCCACTA TCCAGAACCT CCAATCTTTC GACCCCTTTG CTGATGCAAC 
TAAGGGCGAC GACTTACTCC CGGCAGGGAC TGAGGACTAC ATTCATATAA 
GAATCCAGCA GCGGAACGGC AGGAAGACGC TGACCACTGT GCAGGGCATT 
GCGGACGATT ATGACAAAAA GAAACTTGTG AAAGCTTTCA AAAAGAAATT 
CGCCTGTAAT GGGACTGTGA TTGAACACCC TGAGTACGGA GAGGTCATTC 
AGCTTCAAGG CGACCAAAGG AAGAACATTT GCCAGTTTCT TTTGGAGGTT 
GGCATCGTCA AGGAGGAGCA GCTGAAGGTT CACGGATTCT AAGATGAACC 
CGAACATGTG GCGAGTTTCT TAAATGGTTT TGTTGTCTAA CTCAGTTTGG 
CTGCCTCGGG AGATGATTCT TTACAGTAAA CGACAGACTT TGCGTTTATT 
AAATCATTCA GACTTCCACT CACGCCTGCA TGGCTACAGA AAACATGGGG 
TATGTAGGCT CCTAAGTCAC AAGGAAATCG CCGTGAGGTG GGGACGAAGC 
CCGAGTCCGT CCTGACATGT TTCCAGTGGA AAAGATTTTG TTCTGAGCGT 
TCATTTCTAG TTTATTTTCA CTTGATTGTT AAATGTTTTT GTTGTTGTTT 
TATTAAACCA TGTATGTTGC AGCTTAACAA TAAAGGAGGA AAGTCTGTGC 
GTCAAAAAAA AAAAAAAAAA AA 

PEGen 32 Protein Sequence 

MSTIQNLQSF DPFADATKGD DLLPAGTEDY IHIRIQQRNG RKTLTTVQGI 
ADDYDKKKLV KAFKKKFACN GTVIEHPEYG EVIQLQGDQR KNICQFLLEV 
GIVKEEQLKV HGF- 

FIG. 35E 

PEGen 42 cDNA Sequence 

GGCGTTGCGA CGTGGACATG TCGGCGTCGT TGGTCCGCGC CACCGTGCGG 
GCCGTGAGCA AGAGAAAACT GCAACCCACG CGGGCGGCGC TCACGCTGAC 
CCCCTCTGCT GTGAACAAGA TAAAACAACT TCTTAAAGAC AAGCCTGAGC 
ATGTGGGTCT GAAAGTGGGT GTGCGGACCA GGGGCTGTAA CGGCCTCTCT 
TACAGCCTGG AGTATACAAA GACAAAAGGA GATGCTGATG AAGAAGTTAT 
TCAAGACGGA GTCCGAGTGT TCATCGAGAA GAAAGCCCAG CTAACCCTGT 
TAGGCACAGA GATGGACTAT GTGGAAGACA AACTGTCCAG TGAGTTTGTG 
TTCAACAACC CCAACATCAA GGGAACCTGT GGCTGCGGTG AAAGCTTTAA 
CGTCTGAAAG CTGAGGACTG CAAACTCCAG GAGAGCTGGG TCTGCCTTGG 
AGCACACCGA AGAAATCATG TGATGTCCCG TGTCGGAAGT TAGTGTGTGG 
CTGCCTCGTG GTTGAGAATA AAGTGAAGCA TTGAAAATCA AGCCAGCGTG 
TTAGAGTTCC AAAAACATGG TGTCTGTTCT CTGTAAGACA CAAATGGAGA 
GAACATGGTG TCTGTTCTCT GGAGGACACA AACTGAGAAA CTGTTGAGTC 
CTCTGTCCTG TACAGAAAAC TCCTACCCTG CCCTTACGCT GTAGCCTGCT 
CTGTGCTAGA ACCAGCTTCG TGACCATTGC TTTGCTGGGA ATTGAGGAAT 
GGGATAACGG GTGTGCACCT GGGTCACAGA ATGGCTTGAG ACTGTCTCCT 
GGCCCTGTCT CACCTCAGGC AGGGCAGCTG TGGGAGCAGC AGCTGTGGGA 
GCGGTGAGGG GACCTGGTTT CCCTCACCTG TGGCGTGGCC CGTTGCATCT 
TTACCACGTG CCTGTTGTCA GATACCTCAT TTGCCAGCCT CCAGCAAGCT 
CAGCTATGAG TGCCAGTCTC AGGAGGTAGG GATCACGGGC CTGGTGTCAG 
TCTGTCCTCT GGGGCGTGCT TCATGCGGTT TGCTTAGACC TTTCAGTTAG 
AAGCGCTTGT GATGAGCAGC CAGGTAGACC TGCTGAGAGC GTGGTTCTCA 
GAGCTTCTGC CCAGCCCTCC TCACAGGTCA CAGCAGACAG TGCTGTCTGA 
GACACTCGGT GAGGAGACAT CCTGCCTGGC CAGTGCTCTT ACCAGTTTAG 
AGACTGCATT AGTTTTCTCT TGAATGGAAG CCTTGTGTAA ACCCTTTTGT 
CTGAATGGCC ATCCTGTTTA GAGCTTTGAA CCAGTAGTGT CTTCCTTCAG 
AAGATCTGCA GCAGAGGGGT CCCTCTCAGC ACGGCACCTG GGGGGCAGAA 
CATGCACACA CTTACAGTTG CCAGGGTGCA GATGCTCCCT GCTTCCCAGA 
GGAAGCTTCT AAGTTTCTTT AATGTGGTCA TCACCAGTTT TTTGAGCCAT 
GGTTTTGCTG TATACTACAG GCCAGCCTTG AACCCACAAC AATCCTCCTG 
CTTCCACGTT CAGAGGCATG TGCTACCACA CCTGACCTGG ATCCCAAGTT 
TCTCTTTAAG TGGTCTTGAT GGACTTGGGT CGGACATCTT AGTGACCTGT 
GAATTCTTCT GTGGAGGCTG AGTCTCACGT AGCCGAGTTT AATATCTGTG 
CTATTTACTA AAGTATCTGC CACCAAATTG TACCAACTCA TAGTTTTATA 
TGAATGTTGA TGAGTCTGTA TCATAAATAG AATTGTTGAT ACATCCTTAA 
TTTGTGCAAT ATTGTATGAA GAAGATTGTT ATCAATTAAA ACCACGCCTC 
TTTATGATCC TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
AAAAAAA 

PEGen 42 Protein Sequence 

RCDVDMSASL VRATVRAVSK RKLQPTRAAL TLTPSAVNKI KQLLKDKPEH 
VGLKVGVRTR GCNGLSYSLE YTKTKGDADE EVIQDGVRVF IEKKAQLTLL 
GTEMDYVEDK LSSEFVFNNP NIKGTCGCGE SFNV- 

FIG. 35F 

PEGen 45 cDNA Sequence 

ACGAGCTGAA GGTCACTTCG CGCACGGGTT GGACCTGGGG CAGGTTGGAG 
GAGTAGGAGT ATGTCATTGG GCGCGAAGAC GGGGTCTGGG GCAAAAAAGA 
AGGGAGGCTG GAGAAATCTG GACCCGAGAC GTAGTAAGTA CAACTTGGCA 
AATACATGTT AGAGGAGCAG GGACCACGCT CATCAAAATC CATCATTGGG 
CTACCTTGGG CTCTCCGCAG TAGCCGAGCT TAACATGATT CTCCACTGCA 
GCTGCCTCTT TGAAGCGGAT CCGTGAAGTA GAAATTTGGA GACGTAAGCT 
GACGTGGAAA TCTATCCCCA TCCTTAGCAG GGAGGTGCTG GTCATGTGAC 
CCGATGTTGA AATTGACAAG CCGCGAGCTA GTCCCGGCTT TTTTTTTTTA 
ACCCCCCTCC CTTTCCTTTT TTCCCCCTCC CCTCCCTCCT CGGCTTCCTT 
TCTTTGTAGC CACCTCAGGG GAAGCAACAG ATCGTCACTC GGTGTTCTCA 
CCGAAAGCAC GTAATCGCCG GTGTAACTCA TGTTGGCTGG GGGGCCTCCC 
CGCTCGCAGA AAGGCTGGGG TGCGCCCCCA AGCAGCTTTC CTTTGCTCAG 
CTGCATGGTC CTGGTCCACG AGCGCTCTGA GGGCGGCAAG AGAGCGCAAC 
TCCTGACGCC TCCCCCCACT CCCCGGTGGG TGAGGGATGC TCTGGGATGG 
GGGTGGCCAG GTGAACGCCC GGAATTGTGT AGCTTCAGGT TCCGGAGTCT 
GTTGTCCGAA GGCTTACGTT CAGCACCTTC TTCGCAGTCC CCCTCCCACA 
GACTTGCTCT GGAAAGCACC TCAGTCTCAG AATCTGGCTG GACCCCATTT 
GGGGCCAGGC TTCGCAGCCA CGATGTGCCG GGCTTCGTGG CTTGTCCGAT 
TTGCACGGTG ACTTGATTAC ACGCTCTCAT TCATGGTCAC TTCCGAAGCG 
CTTTAGTGCC TTCCGTCCCC AAACCGCCAA CAGGCAAAGC GGCTTTCCTC 
CGCGGTTTGT CAATAATCCG CGCTGTCCGG AAGGGCTTCG CCTTACCCGG 
GTTCCACCTT CCCTGTATCT TTCTGCTTAC TTCCTCATCC CACACTCTGT 
CCTTGGAGGA ACCCCTTCTC CTCGCTGCCT GTAGGGGTTC GGAGTGACTC 
CACAGAGCCA GAGGCGCTTC TGCTCACCGG TCCGCAAGCT GCCTGGTCTG 
CTGAAGCTGA CGAATCGGGA AACCATGCAA TTGAGGCGAA CCTTGGGCTG 
CTTTAGAGGC GCTGAGGAGC CTTCTCCTGG GAGGCCCAAG GTCGATTTCA 
GCCCACCAGG ATCTGGGGAA GACCCAACTA GGGGTAAGAG CACACCGGAA 
GGCCAAGTCC GAGTTCCAGT CCTAGAAGAG GCGGCTGCGG GCAAGGTTAT 
GACATTGGCC CTGGACACTG GTTTCCCAGG AGCTATTCTT TCTCAAGAAC 
TCCACAGCAC GGGGCTGTCT CCAGAAAATA CTCTTCAACG TTTATTTCCT 
TTAATCGTCA ACCCGCAGCC CTACGGCGGT TAATGCGAGA GGCCAAAAAT 
GTTTGGAGGA AGAAAAACAA AGGCAGGAAG TGGCCGCGGC CTGACGGTGC 
GTGTGTGTCT GTAAAGAAGG GAGGGAGCCG GTTCAATCTC TTCTTTTTTT 
CCCCGAATTT CAAGGTTTAG GCAGACCCCC GTAGGGCCTG GCCGAGGCTC 
ACCCGGCGGA GCATTTGGAG GTGGCCAATG AGTAAGGCTC GTCGGGCTGA 
AAGGCTAAGA AGGAGATTTG ATCGGCAGAA CAAACCAAGC CTTTTTGGAG 
GTTTCTTCTG ATTTGGTCCT AAAGGGTATA TGCTAGTGTC CACAGCGGCT 
CCTGTGGCTG CTGTTTTCCT CCTGTCGGAC TAAATGTACC AAGAAGGGAG 
AGAGATTGAG GCACCTTGCG CGCTCCTCTC TCCTTCCGAG GTAGAATATC 
AGAATAAAGT GTATTCAGGT GCCAA 

FIG. 35G-1 

PEGen 50 cDNA Sequence 
A: 

ATCGGGCTGT ACTAACAGAT TGTTTGTAAA CAGTGACACA GTGATAACTT 
CCGTGTTACT TCTTAACTTT ATGTTTCTGC TTTCAGATCT CCCTCCCCTT 
CCAGAGGAAG TTAGCGATGC CATAGCTTTA ATGTCTGTTT TAGCTGCAAA 
ACTCATTGTT CACTTTCTGT TAGAAAATCT AAAGCAGGTG GTATGCAATT 
TCTCTTGATT TGGAATTCTT TAAAGGCAAG TAAATTTGGA ACTCCTGTGT 
TGGGGGGTTA ACGGAGGTAG GAACCCAATG GTGTGTCCCT AGGTCGTCCC 
CGTTCTCGGA TAGCACAGTC TGCATAGCCA TAGCTCTCAA TTATGTCACT 
ACCCTAATCA TCGCAGCCCG GTTCTCACGG ACTCTTTGAA GTCCCAAAAT 
GACTTTTGTT TGATCCTGAT TTGGATTTTC AATGGAAAGT AAAAGCTTGG 
GGTGAGGAAG CAGCAGCTAA AGCAGGGAGT TGAGCCAGTG AATTGCTGAC 
GGAAAGGATT CTGGTCTTGG AGGAGGGGGA CCTGAAGCAG AAGGAAAAGG 
GATCCTTCGC TTAAGTTCTT AGGAAAAATC TTGACTCAGA ATCCCAAGAT 
TTTTCCCTTC ATCCCAGCCG GGTAAATATT TGGTTTTGTC TTTTAAGTAT 
AGCATGAAGC CCGTGGATGA GAGCCATGTG TTGTAGGATT CTCTTCCCTA 
TTGGCTCTGA GCTTGTGTCA CCGTATCAGT TTGCTCCCTA CAAAGGGACC 
TAGTTTGGAA AGGATTGGAA GGGCAACTGT TCAGCGGCAA TGGAACACCC 
AAACGTGGAC TGGGACAACG GGATTCTGAT AAAGGGAAAT TTCTGGTCTG 
GTCCTGGCTG TGTCATAGCT CTTTATGTGT GCATGGAGAG CTCTTGATCC 
AAGTAGAATA TGTAACAATA CAGACCAGGA TCTTCCAGTC AGTACTGCTG 
GGTGGAAGTG GGCGGGTGAT GGTAGTTGCT AGAAGAATCA TTAAGACAGC 
ATCTGCGGTG AATGCGTCCC AAAGCCTCGC GGCATCAGTT TCATCTCTAA 
ACCATTAGCT TACAGTTGAT TCCGTTTCCT GGGACAGAGA AACATCCCCA 
CGCGAAGTGA CTGTGTTGTG TATTCATAGC ACTGCAAATA AATTCACGCG 
CCATGATGAA ACCTTGCAAA TACGCTTTGA CCAAAAAAAA AAAAAA 






FIG. 35G-2 

B: 

GGGTGTGGGG CAGCTGGGTG GGAGCAGCGT GCAGGCTACC AGCACCAAGT 
GGTGTGCCTC TCCGGGGGTG TGTGCAGAAG GCTCCTGGGG AAAACTGCAC 
AGGTACCACC CCTAGACAGA AATCGAAAAC CCACTTCTCT CGGTGCCCCA 
AGCAATACAA GCATTACTGC ATCCATGGGA GATGCCGCTT CGTGATGGAC 
GAACAAACTC CCTCCTGCAT CTGTGAGATA GGCTACTTTG GGGCCCGGTG 
TGAGCAGGTG GACCTGTTTT ATCTCCAGCA GGACAGGGGG CAGATCCTGG 
TGGTCTGCTT GATAGGCGTC ATGGTGCTGT TCATCATTTT AGTCATTGGC 
GTCTTGCACC TGCTGTCATC CTCTTCGGAA ACATCGCAAA AAGAAGAAGG 
AAGAGAAAAT GGAAACTTTG AGTAAAGATA AAACTCCCAT AAGTGAAGAT 
ATTCAAGAGA CCAATATTGC TTAACTTAAT GATTATAAAG TTACCACAAG 
CTGATGGCGA GCTCCAAAAG ACCTGACTCA TTTGCAGATG GACAGGACAT 
GTCTCAGGAA AACAGCTTGC AGAAATGAAT GTTTAAATAT TGTATTTGCT 
TTTTCATTTT ATTTGTAACT GTGTGTTGTT ATTGTTTTTA ATAATGATAT 
TTTTGTTACA GTCTGATAGC TGAGAAAAAA ATGACCTGGT TAGGTGACGA 
CAATAAGGGA CATTGAATAT AAACTTTGTT GCTAGGATTA TTAAACAAAC 
AAAATTTGGA AAGAAGTTAG ATTTTAAGAA CTGAGTCATG GTCAGGCAGC 
GATGGCACAC ATCTTTAATC CCAGCACTTG GGAGCAGAGG CAGGTAGATC 
TCTGGGAGTT TGAGGTCAGC CTGGTCTACA AAGCAAGATC CAGGGTAGCC 
AAGGTTATAT AGAGAAACCC TGTCTCACAA AACCAAACCA ACCAATCAAC 
CAAACAGCAA AACACCTGAG TCGATAAAAG GGCTCCCCAG GTTTATACAC 
TTACCGTATG CTAAGAGCTT GAAATATATT GTTTCGTTTT ATCGTTCAGT 
AGTCTGTGAG ATTGCATTTT TTCTCATTCC TATATATAAA AAAGTTAAAT 
GATTTCCCTT AGATGTAGAG ATAGAGGAAG TTAGCGATGC CATAGCTTT 

FIG. 36 

PSGen 27-Novel 

NTCNNCTTNN CNNNGGCTGA TATCNGGCNC TTCNTCCNCG ATCNCAGATA 
CNNGCNCACC GGNNNTHTCN GNGGTNATCN TCCNCCATCT CTCNTCCCCG 
ACNTGCACTC CGGGTNTNNT ACACNGGACA CTGTATCNNA CAGNAAACCT 
NCCCNGGCCC CAGGGATCAC CATNCCTCGN CCCNGCNTGT NTATAANATC 
AGGNNNTACA TCNANGAACN NACTATCACN GNTCTCTNTT NNCTCAGTGT 
NCACCTTCCA CTNCNGAANC TNNTCGCTNC NCCNCNGTTG GGAAAGGCGA 
NCNGTNCCGG CNACATGCCG TTTNCGNCNT CTGNNCACNT GGGGATCTNC 
TNCAANGNAA TCAATTNGNG TAACCCACGG TTTNCNCAAT CACTACTTCT 
CANNCNANGG CCNTTGAANT GTTATCCCAC CACCANGGGG CNANTCGGGA 
CCTNACAATT CATCCTCAGC CGGCCCCAGN CTTAAAAAAT TCAAAGGNCN 
CTTGCCCGCN TTNTTNCCTT AGCCCGCCNC CNGACAACAN CCNANNAACA 
ACCCCCNHTC TTANGTTGCN NANCCCACAG GANNTTGNNA TACCGGGTTT 
CCCCNGAAAC TNCTCAANGC CNCCGTTCCA ACCCCCGTTA CGAAACCGTN 
CCCNTTTCCT TCCGAGNTTG CCTATTAANN CCCCCNAAGT TCTNCTTCGT 
TNGNTTCCTC CGAAANG 

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: The Trustees of Columbia University in the City of New York 

(ii) TITLE OF INVENTION: RECIPROCAL SUBTRACTION DIFFERENTIAL DISPLAY 

(iii) NUMBER OF SEQUENCES: 24 

(iv) CORRESPONDENCE ADDRESS: 
(A) ADDRESSEE: Cooper & Dunham LLP 
(B) STREET: 1185 Avenue of the Americas 
(C) CITY: New York 
(D) STATE: New York 
(E) COUNTRY: USA 
(F) ZIP: 10036 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: 
(B) FILING DATE: 
(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: White, John P. 
(B) REGISTRATION NUMBER: 28,678 
(C) REFERENCE/DOCKET NUMBER: 55551-C-PCT/JFW/AKC 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: (212) 278-0400 
(B) TELEFAX: (212) 391-0525 



(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 371 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

TAAANCGGTG GTACTGCTGC ACGGTCCTCC GGGTACTGGA AAGACATCCC TTTGTAAGGC 60 
ATTAGCCCAG AAACTGACCA TCAGACTGTC AANCAGGTAC CGGTATGGCC AGTTAATTGA 120 
AATAAACAGC CACAGCCTAT TTTCTAAGTG GTNTTCAGAA AGTGGCAAGT TGGTAACTAA 180 
GATGTTCCAG AAGATTCANG ACTTGATTGA TGATAANNAA NCTTTGGTGT TTGTCCTGAT 240 
TGATGANGTA AGCACTCANN GGTACTCATT CTTNGTCTGC ATTGCCTCTT GCTATTACTG 300 
CCTGATCCCT CTCATTTGGT TCACTGTGTC GCNANCTCTT TTCTATGGAT CTTTTCCNAN 360 
CCACCCGTTT C 371 
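
Each entry in the sequence listing declares a LENGTH and then prints the residues in groups of ten, with a running position count at the end of each row. As a rough sketch of how a reader might check that a transcribed entry is complete, the Python below strips that layout and compares the residue count with the declared length for SEQ ID NO:1. The function names `parse_listing_block` and `check_declared_length` are hypothetical, and the rows are transcribed from the SEQ ID NO:1 entry.

```python
import re


def parse_listing_block(lines):
    """Join sequence-listing rows into one string, dropping spaces and position counts."""
    residues = []
    for line in lines:
        # Keep only letters; digits and spaces are layout, not sequence.
        residues.append(re.sub(r"[^A-Za-z]", "", line))
    return "".join(residues).upper()


def check_declared_length(lines, declared_length):
    """Return (sequence, ok), where ok is True if the residue count matches the header."""
    seq = parse_listing_block(lines)
    return seq, len(seq) == declared_length


# Rows of SEQ ID NO:1 as printed in the listing (declared LENGTH: 371 base pairs).
SEQ_ID_NO_1_ROWS = [
    "TAAANCGGTG GTACTGCTGC ACGGTCCTCC GGGTACTGGA AAGACATCCC TTTGTAAGGC 60",
    "ATTAGCCCAG AAACTGACCA TCAGACTGTC AANCAGGTAC CGGTATGGCC AGTTAATTGA 120",
    "AATAAACAGC CACAGCCTAT TTTCTAAGTG GTNTTCAGAA AGTGGCAAGT TGGTAACTAA 180",
    "GATGTTCCAG AAGATTCANG ACTTGATTGA TGATAANNAA NCTTTGGTGT TTGTCCTGAT 240",
    "TGATGANGTA AGCACTCANN GGTACTCATT CTTNGTCTGC ATTGCCTCTT GCTATTACTG 300",
    "CCTGATCCCT CTCATTTGGT TCACTGTGTC GCNANCTCTT TTCTATGGAT CTTTTCCNAN 360",
    "CCACCCGTTT C 371",
]

sequence, length_ok = check_declared_length(SEQ_ID_NO_1_ROWS, 371)
```

Because the parser keeps every letter, ambiguous positions written as N are counted as residues, which matches how the listing itself counts them.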

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 245 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

GTGACGTAGG GTCTGTTGCG TCAATGGTTA TAGCAAGTGA TGCTCTCTGA TTATTACTGC 60 
TGACAATACT CGGCCAACAA TTCTTGCATA GAGTGCTGAT AAATAACTAT GTTACAAAAA 120 
GGGGTGGTCC CTGGAGAACA TTACAGGCTT CCCTAGGTAA GTGTGCAGGT CAGGAGACGG 180 
CATATTCAAT CAGATGGCTG ATAGTTCTCC GTGGTTATGC ACCGGCTCCA GCTTGCCTAC 240 
GTCAC 245 

(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 178 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

GCAGCATGAT GAATTTAATG CAACAGTCAT AGCAGGGCAA GGGGAGAGAA AGGCAGATGG 60 
ACTATCTGCA TCATCAAGCG AGGGCTTGTG TCGGCGGCTA TGTGCAGAGA CGAGCAGGGC 120 
GAGGCACTTA AAAGCTGCTN GATGAAAATC CACCCAGGAG AANTCTGGGC CTACGTCA 178 

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 191 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

TGACGTAGGC CCAGACTTCT CCTGGGTGGA TTTTCATCCA GCAGCTTTTA AGTGCCTCGC 60 
CCTGCTCGTC TCTGCACATA GCCGCCGACA CAAGCCCTCG CTTGATGATG CAGATAGTCC 120 
ATCTGCCTTT CTCTCCCCTT GCCCTGCTAT GACTGTTGCA TTAAATTCAT CATGCTGCCA 180 
AAAAAAAAAA A 191 

(2) INFORMATION FOR SEQ ID NO:5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 124 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

GCCATAAATA CACTTTATTT CATTCGAAAT GCATAATCAC ACTGGGAGCA CTCCCTTTGG 60 
AGCACTCCTC TAGCAGCAGG TCCGAAGTGC TCCAGCATCG TCAGCTGGCT CCAACACCTA 120 
CGTC 124 

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 61 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

TTTTTTTTTT TTTGGAAACA GAATAAAGTG CTTTATTCTC TGGCTGGCTC TCCTACGTCA 60 
C 61 

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 216 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

TCGGCGATAG CATTGGAGCA AGTCTTATCA GCAAGCAATG TTTTCAGTTA TGTTTCAAAG 60 
TTAAGAATGG GTTTAAACTT GCTGAACGTA AAGATTGACC CTCAAGTCAC TGTAGCTTTA 120 
GTACTTGCTT ATTGTATTAG TTTANATGCT AGCACCGCAT GTGCTCTGCA TATTCTGGTT 180 
TTATTAAAAT AAAAAGTTGA ACTGCAAAAA AAAAAA 216 

(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 334 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TNGCCAGGCT ATGTCTCAGA 60 
CTTTATTATT ATTATTATTA TTATTATTAT TATAAATAAA ACATGTNCTT TCAATTAGGT 120 
TACAANAGTA TTTATCTCCA TAACGCTTCT TCATACATCC TTAGTTTTGG ATTAAAGTAC 180 
CATCCACCCC AACTCAAACT GTAACCCCCA GTAATCCCCT CTAACGTGGA AATTTCTGGT 240 
TTAACAACTC AGTTAACTGC CCCACAAACA GTGGGAGGCC GCTCTTGCAT GGCTATGCCA 300 
CGTAACCCTT CACTGCTTCA CTTCTTCGCT GGCT 334 

(2) INFORMATION FOR SEQ ID NO:9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 136 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

GACCGCTTGT ACCATCCAAC TTGCTTTGTC TTCTGCAGAG AGGAGGCTAA AGCCCTTGAG 60 
CTGGCTGGCA CTGTACTCAG GCCGGAAGCC CAGCTCGTCC CGGTTCTTGA CAAAGCAAGT 120 
TGGATGGTAC AAGCGG 136 

(2) INFORMATION FOR SEQ ID NO:10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 316 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

TGCCGAGCTG GGTATTGTGA CGGTTGATAA TGGCGGCATC ATGTTGCCAG GTACCGGGTA 60 
AGCAGACCTC AGAGCACAGC TTATTGTCCA GTGCTTTCAC GCTCGCGACG TCAAAGTCAT 120 
TGTTATTGTC ACACTCCATG CCTAGAAATG CGCATGTCCT CTGGCCATCT TCTTGCACAG 180 
GGGATCTGTC CTCTTCCTCC ATGATATCAT TTCCCTCTGC ATCCTGCTCT CCAGCTGGAA 240 
GGCCAGCAAA ATTGCTGTCT GGGGACTCTG CTGGGGTCTC CTCCTCTTCT GAAGGGGCCC 300 
TGCTAGCAGC TCGGCA 316 

(2) INFORMATION FOR SEQ ID NO:11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 337 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: 

AGGGGTCTTG ATGGACTTGG GTCGGACATC TTAGTGACCT GTGAATTCTT CTGTGGAGGC 60 
TGAGTCTCAC GTAGCCGAGT TTAATATCTG TGCTATTTAC TAAAGTATCT GCCACCAAAT 120 
TGTACCAACT CATAGTTTTA TATGAATGTT GATGAGTCTG TATCATAAAT AGAATTGTTG 180 
ATACATCCTT AATTTGTGCA ATATTGTATG AAGAAGATTG TTATCAATTA AAACCACGCC 240 
TCTTTATGAT CCTNNNAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 300 
AACCNCCTCA AATCCATNGG TTCTAACCCA AAACCCT 337 



(2) INFORMATION FOR SEQ ID NO:12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 307 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: not relevant 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 

TTTTTTTTTT CATACACCAT CAAACCAATT TTATTTCTAT AGCAACGTTT CTCACGTCTG 60 
AACCTGAGAA TAAGTCACCA GCTCTTGACA GTAAACATGG GCCCTATCAA ATTATATTAG 120 
ACTCCTCAGT GTCCCGCCAT GTGGCCTTGC ACCAAATCAA TTAGTTTGAG GGCCAAAATC 180 
CTGTTGGGTT TCAAATAAAG TGTCAGGTCA TAAGGAGGGG GAGGGACTCA ATTCATGGGA 240 
ACATTTTTAC CTGTTCAAAT AGATAAACTG AATTGCCCTA TCTGTGGTCA CCTGGATCCA 300 
AGACCCT 307 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 296 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

CCCTGACGAT AAATGGTAAG GAACTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTNC 60

GAAATAAACA AACACAGCTT ATTATTTGGG GGAACATTAA NTTCTATAAN TGAACACAAA 120

ANAAAATTAA NANTTAATGG GGGGGTANAA GGGACTTTGA ATCTATCTGG TATCATGACA 180

TTGAAGCANA NACCTGANTG ACCAGAAAGA GAGAGAGAGA GAGAGAGAGA GAGAGAGAGA 240

GAGAGGTTTC ATATGAGCTA GTGTTACAGG CTTTATTAGT CTATTAGTCA GGGACC 296

(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 319 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
AATCGGGCTG GATGGGTGTA TCCGGCACTG TTTCGTAGCG GCAGCAACTG GGTGCTTCTA 60

TCTGAAAGCG GGCTTCACAA AAACTACTGC GCCACCCGAC TCGCTGCGGC ATCGCCCGGT 120

GGCGAGTACC GTATCGCCTT TCCTGGTGCA GAAGAAGTGT TTACAGGAGG CGGTCATTTA 180

CCGCAATCTG ATTCTGTTTT TTATTCTCCC TGGCGGGTGA TCGCGATCGG CAGTTTGAAA 240

ACGATCGTTG AATCCACGCT CGGGAATGAT GTGGCTTCGC CGCCAACGCT TACTGACATT 300

TCATTTGTAC AGCCCGATT 319

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 287 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GCCGAGCTGT GTAAAACCAT CTATCCTCTG GCAGATCTAC TTGCCAGGCC ACTCCCAGGG 60

GGGGTAGACC CTCTAAAGCT TGAGATTTAT CTTACAGATG AAGACTTCGA GTTTGCACTC 120

GACATGACCA GAGATGAATT CAACGCACTG CCCACCTGGA AGCAAATGAA CCTGAAGAAA 180

GCGAAAGGCC TGTTCTGAGG GTGAGATGAC AGCCACAGAG AGGTCACTGC CACTAGACCA 240

GAAAGTGGAT GGAGATATAT ATTTGGACTG GTGTTTTTTT CTGTCAG 287



(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 344 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
ATCGGGCTGC AGATTGGAGA CAAGATCATG CAGGTGAACG GCTGGGACAT GACCATGGTC 60

ACTCATGACC AGGCTCGGAA GCGGCTCACC AAACGTTCGG AGGAAGTGGT CCGCCTGCTG 120

GTGACTCGGC AGTCTCTGCA GAAGGCCGTA CAGCAGTCCA TGCTGTCATA GCTGTAGTCA 180

GCCTAGACTT CTGCCCACTG ACCTTTTNGG GCACTGAGAA CACATCCACG CTCTGTCTGT 240

ATCTAGTTCT GGCTTCTGCT GTGTGCTANG CCCCAGCTCT GAGGAGTAAC AGCTGATCCC 300

AAAGGTCCAA GCCAACCTTC TTACCCCTCA GCCCCCANCC CGAT 344

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TTTTTTTTTT TTTGGGCAAC TATGTATTTA TTGTGTTTGG AAGGCAGAGT GAGGGAGGAG 60

ACCCCAGCAG GAAGAAGACT GGGTGCAGTC TAGAGTTCCT AGTCAAGAGT AGGAAGGTTT 120

CTGTTATACC CATCATAGAA CGAGAGAGGG GGCTCAATAG ATCATCCCCT TTGTCTCTCC 180

ACGGGGCTTC TTGAGCTTCT CAAAGTTCTT CAGGATGATG TCATATAACA CAGCATAAGC 240

GTTACGGATC TCCATGACCA TCAGCCGGAT CTCCTGGTAT TCCGCCTCGT CCAGCTCGGC 300



(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 461 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
AANATCTGCT TAAAAGTTCT TTAATTTGTA CCATTTCTTC AAATAAAGAA TTTTGGTACA 60

AATTAAAGAA CTTTTAAGCA GATGTTTTGG TGCAACTAAT AGAAAAGATA AAGGCAGCCT 120

GACATGCATG CACTGCCTCA GTGACCAGTA AAGTCACATG NCCTTGGGAC GTCAGCTTAG 180

NTTTATCACN GTGTCCCAGG GGTGCTTGTC AAAGAGATAT TCTGCCATGC CAGATTCAGG 240

GGCTCCCATC TTGCGTAAGT TGGTCACGTG GTCACCCAGT TCTTTAATGG ATTTCACCTG 300

CTCATTCAGG TAATGCGTCT CAATGAAGTC ACATAAGTGG GGATCATTCT TGTCAGTAGC 360

CAGTTTGTGA AGTTCCAGTA GTGACTGATT CACACTCTTT TCCAAGTGCA GTGCACACTC 420

CATTGCATTC AGCGCGCTCT CCCAGTCATC ACGGTCACNT A 461

(2) INFORMATION FOR SEQ ID NO: 19:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 280 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TGACGTAGGG CCGAGAGCAA CAAGCACAGA ACTCCTTCTC CAGTTTCACC CTGATGAAGT 60

TGAGGCACTC TTCTGCACTG GGAGGGGCCA GCCTGGGGGC CAGGCACATT GGACACCACC 120

TTCCCATGGA CTACAGCGTC AATGCCATTG CCTTCTATTC CTATACCTTC TAGGGGCTGC 180

CCCTCTTCCC ATTCAGCCAA CACTGAGTGT TGGGAGATTT CTCTTTTTTA AAAACACATG 240

AGAAAATAAA TGCACTTTAC TCCCTCCCCA AAAAAAAAAA 280

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 177 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

GTAGGCAATA AAATGTTTTC AGAGGTGCGA AAAAGCTTTT GTTTTCTTAA ACCATTCTTA 60

GTCTCTGCCA CACTTGACAC TCCGTCAAAG TGAGAAGCGA ACTAAAGACC AACTGCGGTG 120

GAAAATATTA TGTTTATGTA ATAAAAAAAA ATCATGTAAC TGCAAAAAAA AAAAAAA 177

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 633 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

TGCCGAGCTG AAAACATACA TCCGCACCGG GTTGAGATAG CTGGCCCTCC GTCCCCGGGC 60

ATACTCTTTG GATAAGAACC CCGGCCTTGT TACCAGGTAC CGGAGTGAGC TGAAAAATTT 120

ACCGTCGAAA TGGGTGATGT CCTGGAAAAA ATGGTTCACC AGCTGCCAGG CAGATTCTTT 180

GGGTTCCACA TTTTCCTGCC CACAGATGTG GCAGAAGCGG TCAAGTAATG CAGCATTACA 240

ATTGAGGCAG ATCTTTTCTT TTCTTTCCTT GGAGTGGCTC AACCAGCGAT TTTGGTTAAA 300

AATAATCAAA AAAGCGACGG CAAAACTTTT GTTATATTCC CGCCTGTGGC ATTTGAACTG 360

TGCCCGGCAA CCGAATAACT TTTAATTTTG AAAATAAAAT GCATACTAGA TTTTTAGCGG 420

TTGCCTCCTG GCCATTGCTT CAGGCGCCNG CACAGCGTCA GCCCAGTTTT ACCACNANGA 480

ATATCCTAAG CGTTGAAACA GGGCACAGCC GAAAAAAACN CTGGCNACAA AAAANATCCG 540

GACATCCTTT TTCCAATTTT GAAACCGAAN GCNCGCAAAC NAAGGTTCTT CGGGAAAAAA 600

AATCGCCAAA ATACNCGANA TCAAACTNTC CAA 633

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 213 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
TGCCGAGCTG GGGGGAGTTC CAGGAATTTG TGGACTATTT CCAGGAGGAA TTGAGGAATC 60

TAGAAGTAAT AAGAACTTCA CAAGTAGAAC AACAGAGTTA ATTGACCTCT ATCCTTAAGA 120

GTTACCAGAG AATTATTAAA AAACTAAAGA ACAATCAAAG CCTGGTCCTG TGCCACCACC 180

CAAAAACATG TATAGCCTAT GTGCAGCTCG GCA 213

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 679 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

CTCANAGGGC NNNTTNGNGG NCNTCATGCN CCAGGNTCCN NCCCCCANAN GANCNNCCNG 60

GTAAACTACA CNGGAGTACT TAAGTGGACA NNCCACATGC GANGGNCAAG GGGATCACCN 120

TCNCTCCTNC AGNCTNTNCG TGNCTCTCCT GTNCNTNCAC TGCCNCANAA NGGANGCNCN 180

NNCTCCTATC TGTNTACAGN AAACNTNGCN CTNNCTCTAA GCTCNCCCAC TNTGTGGAAA 240

GGCNATGTGT GCGTGCCTCT CCCCTATCAC GGCNGTTTGC NAAANGGGGA TGTNCTGCNC 300

GGCGATGAAG TTNGGTCACT CCATGTTTCC CAGTCCNACC TGTTAGACNA AGNATTGNAN 360

TGTGATACGA CTCNCTGTAA GGGGANTNGC GGACCCAGTA TGTTTGGCCC NACNNCCACT 420

TCTTTAAATG GTGGCTAACG GCGCTTCCTA GNATAAACAC TATTGGTCCC CCCCTCTGCA 480

GNACCCNTTA CTTCCGNANA AAAATTGTTG TCNTGATCCG CGACAACCAC ACCGTCTGTN 540

GNTTTTAGTT GCAACNCNNA TCNCTCCAAA AAAGTTTCAG AAATCTTCAT TTTCCCNGGT 600

TGAGCCCNTG ACAAACCCCT NAGGATTTGT CGAATGTAAA GTCTCCNGAT CTTCAATAAA 660

NNTCCAAAAG NCTANCGAT 679

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 717 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: not relevant 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

NTCNNCTTNN CNNNGGCTGA TATCNGGCNC TTCNTCCNCG ATCNCAGATA CNNGCNCACC 60

GGNNNTNTCN GNGGTNATCN TCCNCCATCT CTCNTCCCCG ACNTGCACTC CGGGTNTNNT 120

ACACNGGACA CTGTATCNNA CAGNAAACCT NCCCNGGCCC CAGGGATCAC CATNCCTCGN 180

CCCNGCNTGT NTATAANATC AGGNNNTACA TCNANGAACN NACTATCACN GNTCTCTNTT 240

NNCTCAGTGT NCACCTTCCA CTNCNGAANC TNNTCGCTNC NCCNCNGTTG GGAAAGGCGA 300

NCNGTNCCGG CNACATGCCG TTTNCGNCNT CTGNNCACNT GGGGATCTNC TNCAANGNAA 360

TCAATTNGNG TAACCCACGG TTTNCNCAAT CACTACTTCT CANNCNANGG CCNTTGAANT 420

GTTATCCCAC CACCANGGGG CNANTCGGGA CCTNACAATT CATCCTCAGC CGGCCCCAGN 480

CTTAAAAAAT TCAAAGGNCN CTTGCCCGCN TTNTTNCCTT AGCCCGCCNC CNGACAACAN 540

CCNANNAACA ACCCCCNNTC TTANGTTGCN NANCCCACAG GANNTTGNNA TACCGGGTTT 600

CCCCNGAAAC TNCTCAANGC CNCCGTTCCA ACCCCCGTTA CGAAACCGTN CCCNTTTCCT 660

TCCGAGNTTG CCTATTAANN CCCCCNAAGT TCTNCTTCGT TNGNTTCCTC CGAAANG 717